Cloud Economics: Does Location Matter?

If you’ve been following the recent blogs, you’ll know the “L” in Joe Weinman’s C L O U D definition stands for location independence.  One of the five distinctive attributes of cloud computing, location independence means that you can access your data anywhere.  Location doesn’t matter.

Or does it?  Like many things in life, there is a trade-off.  Time is related to distance, even in cloud computing.  The farther you are from your data source, the longer it takes for the data to reach you.  And since timeliness has value, a better location should give better value.  So maybe location does matter after all.  The question is, how much?

Let’s put things into perspective by translating distance into time.  The calculated speed of data flowing through a fiber optic cable is about 125 miles per millisecond (0.001 seconds).  In real-world terms, since Chicago is located about 800 miles from New York City, it would take about 6.4 milliseconds for a “Hello world” message to traverse that distance.
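
The arithmetic is easy to reproduce. The short Python sketch below uses the 125 miles-per-millisecond figure and the 800-mile distance quoted above. The function name is our own, and the result covers propagation delay only, not routing, queuing, or processing time.

```python
# Speed of data in optical fiber, as quoted in the text (miles per millisecond).
FIBER_SPEED_MILES_PER_MS = 125.0


def one_way_delay_ms(distance_miles, speed_miles_per_ms=FIBER_SPEED_MILES_PER_MS):
    """Return the one-way propagation delay in milliseconds."""
    return distance_miles / speed_miles_per_ms


chicago_to_new_york = one_way_delay_ms(800)
print(f"{chicago_to_new_york:.1f} ms")  # prints "6.4 ms"
```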

As we discussed last week, for certain automated trading platforms that operate in the realm of microseconds (0.000001 seconds), 6.4 milliseconds is an eon of lost time.  These systems can make or lose millions of dollars at the blink of an eye.  For that reason you’ll find the serious players setting up shop right next door to their data center.  The rest of us, on the other hand, can pretty much remain in our seats, even for real-time cloud applications.

Why?  Well, first of all, the majority of industrial applications are already optimized for location.  Most SCADA systems are implemented directly inside a plant, or as close as physically practical to the processes they monitor.  Engineers who configure wide-area distributed systems are well aware of the location/time trade-offs involved, and take them into account in their designs.  Furthermore, they keep their mission-critical data communication self-contained, not exposed to the corporate LAN, much less to potential latencies introduced by passing data through the cloud.

Of course, a properly configured hybrid cloud or cloud-enhanced SCADA can separate the potential latencies of the cloud system from the stringent requirements of the core system.  What results is a separation between the deterministic response of the control system and the good-enough response time of the cloud system, which we have defined in a previous blog as “remote accessibility to data with local-like immediacy.”

Another area where the location question arises is for the Internet of Things.  As we have seen, great value can be derived from connecting devices through the cloud.  These of course can be located just about anywhere, and most of them can send data as quickly as required.  For example, devices like temperature sensors, GPS transmitters, and RFID chips respond to environmental input that is normally several orders of magnitude slower than even a slow Internet connection.  Latencies in the range of even a few hundred milliseconds make little difference to most users of this data.  People don’t react much faster than that, anyway.

As we have already seen, user interactions with a cloud system have a time cushion of about 200 milliseconds (ms), the average human response time.  How much of that gets consumed by the impact of location?  Joe Weinman tells us that the longest possible round-trip message, going halfway around the world and back, such as from New York to Singapore and back to New York, takes about 160 ms.  Not bad.  That seems to leave some breathing room.  But Weinman goes on to point out that real-world HTTP response times vary between countries, ranging from just under 200 ms to almost 2 seconds.  And even within a single country, such as the USA, average latencies can reach a whole second for some locations.

However, a properly designed real-time cloud system still has a few important cards to play.  Adhering to our core principles for data rates and latency, we recall that a good real-time system does not require round-trip polling for data updates.  A single subscribe request will tell the data source to publish the data whenever it changes.  With the data being pushed to the cloud, no round trips are necessary.  Eliminating the request leg of each request/response cycle cuts the time roughly in half.  Furthermore, a data-centric infrastructure removes the intervening HTML, XML, SQL, etc. translations, freeing the raw data to flow in its simplest form across the network.
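
To make the publish/subscribe idea concrete, here is a minimal in-memory sketch in Python. It is not any particular product's API: the `DataPoint` class, its methods, and the tag name are hypothetical, and there is no real network involved. It only illustrates that after one subscribe call the source pushes each change, and that unchanged values are not re-sent.

```python
class DataPoint:
    """A hypothetical data source that pushes changes to its subscribers."""

    def __init__(self, name):
        self.name = name
        self._value = None
        self._subscribers = []

    def subscribe(self, callback):
        # One subscribe request; after this, no further requests are needed.
        self._subscribers.append(callback)

    def update(self, value):
        # Publish only when the value actually changes.
        if value == self._value:
            return
        self._value = value
        for callback in self._subscribers:
            callback(self.name, value)


received = []
temperature = DataPoint("tank1.temperature")
temperature.subscribe(lambda name, value: received.append((name, value)))

for reading in [20.5, 20.5, 21.0]:
    temperature.update(reading)

print(received)  # the repeated 20.5 is not sent a second time
```

With polling, the client would have had to ask three times and wait for three answers. Here each change makes a single one-way trip.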

What does this do to our Singapore-to-New York scenario?  Does it now approach 80 ms?  It’s quite possible.  Such a system would have to be implemented and tested under real-world conditions, but there is good reason to believe that for many locations with modern infrastructures, data latency can be well under the magic 200 ms threshold.  To the extent that this is true, location really does not matter.
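
As a rough check on those numbers, the following Python sketch works out the latency budget. The 200 ms cushion and the 160 ms round trip come from the figures quoted in this post. Treating a pushed update as exactly half a round trip is a simplifying assumption, and the names are our own.

```python
HUMAN_RESPONSE_MS = 200.0  # time cushion cited in the text
ROUND_TRIP_MS = 160.0      # New York to Singapore and back, per the text


def headroom_ms(latency_ms, budget_ms=HUMAN_RESPONSE_MS):
    """Return how much of the budget is left after the given latency."""
    return budget_ms - latency_ms


one_way_ms = ROUND_TRIP_MS / 2  # assumes a push needs only one leg of the trip

print(headroom_ms(ROUND_TRIP_MS))  # 40.0 ms left with a full round trip
print(headroom_ms(one_way_ms))     # 120.0 ms left with a one-way push
```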

SCADA and the Cloud – FUD and Facts

A lot of information and questions have been swirling through the industrial automation community over the past year or two regarding SCADA (Supervisory Control And Data Acquisition) and the cloud.  The din of voices from seasoned users, visionary cloud proponents and industry gurus has made it difficult sometimes to distinguish between true benefits, realistic options, inflated hype, and ominous warnings.  Some vendors, who are apparently more concerned about their slice of the SCADA market than helping the conversation, are adding a dash of FUD (fear, uncertainty, and doubt) into the mix.  Before holding any serious discussion, we’d like to address these issues.

FUD: Putting a SCADA system in the cloud is risky and unwise.
Fact: Agreed.  Don’t do it.  Instead, use the cloud to enhance a SCADA system.

Let’s start by eliminating the main FUD factor right from the get-go.  Nobody expects to plop a SCADA system on the cloud and have it perform as well as running it in-house.  The technology is still evolving.  What is possible right now is to extend or enhance a SCADA system by connecting it to a real-time cloud system.  Here is how the concept of SCADA enhanced by the cloud cuts through the typical FUD:

Performance

FUD: SCADA in the cloud will impact your system performance.
Fact: Cloud-enhanced SCADA keeps primary control in the plant with zero impact on system performance, while any connection to the cloud should meet the core performance requirements for a real-time cloud system.

FUD: SCADA in the cloud will have speed and latency issues.
Fact: Cloud-enhanced SCADA systems can support high data rates and low latency.

FUD: SCADA in the cloud means long polling cycles.
Fact: Cloud-enhanced SCADA can be implemented on a publish/subscribe, event-driven basis, with no polling necessary.

FUD: SCADA in the cloud would require several layers of protocol conversion, resulting in poor performance.
Fact: Cloud-enhanced SCADA can be implemented using a data-centric infrastructure, eliminating the need for protocol conversion until the data arrives at its destination.

Security

FUD: SCADA in the cloud exposes your process to hackers and spies.
Fact: Cloud-enhanced SCADA keeps your process running safely in the plant, behind closed firewalls.

FUD: Cloud hosts are more vulnerable to being hacked than in-house systems.
Fact: Cloud hosts typically invest far more in security than most manufacturing companies.

FUD: SCADA in the cloud exposes sensitive data on a public network.
Fact: Cloud-enhanced SCADA should allow you to select which data points you send to the cloud and protect them with encryption and access control restrictions, if necessary.
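
Selecting which data points leave the plant can be as simple as an allowlist. The Python sketch below is only an illustration: the tag names and the `select_for_cloud` function are hypothetical, and encryption and access control would be layered on top of whatever is selected.

```python
# Hypothetical tag names; only these are allowed to leave the plant.
CLOUD_ALLOWLIST = {"line1.throughput", "line1.status"}


def select_for_cloud(snapshot, allowlist=CLOUD_ALLOWLIST):
    """Return only the data points that are approved for the cloud."""
    return {tag: value for tag, value in snapshot.items() if tag in allowlist}


plant_snapshot = {
    "line1.throughput": 118,
    "line1.status": "running",
    "line1.recipe_setpoint": 72.4,  # sensitive: stays inside the plant
}

print(select_for_cloud(plant_snapshot))
```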

Reliability

FUD: SCADA in the cloud means that a connection failure equals system failure and costly plant downtime.
Fact: Cloud-enhanced SCADA means that a connection failure causes momentary loss of non-essential remote HMI interfaces.  The primary control system continues to run, because it is completely independent of the cloud system.

FUD: SCADA in the cloud is vulnerable to hosting service outages.
Fact: Many hosting services support 99.9% and better up-time.  In addition, a properly designed cloud-enhanced SCADA system can provide fully redundant data paths from inside the plant firewall to inside the client firewall.
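
To put the up-time figure in perspective, this small Python sketch converts an availability percentage into hours of downtime per year and estimates the effect of a second, redundant path. It assumes the two paths fail independently, which real networks only approximate. The function names are ours.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability):
    """Return the expected hours of downtime per year."""
    return (1.0 - availability) * HOURS_PER_YEAR


def redundant_availability(availability, paths=2):
    """Availability of parallel paths, assuming their failures are independent."""
    return 1.0 - (1.0 - availability) ** paths


single = 0.999  # the 99.9% up-time mentioned above
print(round(downtime_hours_per_year(single), 2))    # 8.76
print(round(redundant_availability(single, 2), 6))  # 0.999999
```

Under that independence assumption, two 99.9% paths together would be unavailable for well under a minute per year.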

These are a few examples of how to clear up any fear or doubt, using the approach of enhancing SCADA with cloud computing.  From this perspective we can now hold a more meaningful conversation.  Next week we’ll consider some of the more practical questions: What does cloud-enhanced SCADA look like?  What can it do for me?  How can I use it to get the most out of my real-time data?

Realistic “Real Time”

As we look over the technical definitions for real-time computing, it is clear that there is no way the cloud can support hard real time performance.  The cloud is not deterministic.  Despite advances in networking technology, there will always be delays and network breaks, making it impossible to implement any kind of hard real-time system where missing a deadline would cause a total system failure.  If we want to talk about real-time data in the cloud, we need to be clear on what we mean by real time.  We need a realistic definition of “real time” for the cloud.

Someone who has done a lot of thinking about how to define “real time” is E. Douglas Jensen.  Jensen has spent over 30 years developing real-time systems for industrial and military applications in institutions like Carnegie Mellon University, Hewlett Packard, and Honeywell.  More recently he has worked as a consultant for MITRE and the U.S. Department of Defense, as well as in his own consulting practice, Time-Critical Technologies (TCT).

According to Jensen, once you step away from the clear-cut definition of “real time” as hard real time, things get vague.  His experience is that researchers and academics working in the lab have created good theoretical models for real-time computing, but when we attempt to apply those theories in real-world situations, they are often not practical. This is particularly true in distributed systems, and the cloud is the ultimate distributed system.

On his website, Real-time for the Real World, Jensen offers a new approach to thinking about real-time systems in a realistic way.  First, he says that anything other than the concept of “hard real time” has been difficult to pin down and define, even by the experts. Second, it is tough to do any kind of experimental research on real-time systems in the real world because they are typically big, complicated, expensive, and mission-critical.  He then offers a reformulation of the idea of real time that has enabled him to construct highly complex real-time computing systems.

Jensen boils the issue down to two concepts: 1) time constraints (or deadlines) and 2) achieving acceptably optimal system performance within those constraints.  He says, “Real-time computing is about satisfying time constraints acceptably well with acceptable predictability—according to application- and situation-specific acceptability criteria, given the current circumstances.”

As in real life, time constraints can vary in immediacy and importance.  Some deadlines might be like catching a plane flight—not immediate, but important.  You may have some time to get to the gate, but when the plane takes off, you’d better be on board.  Other deadlines might be more like a phone call, which is always immediate but may not be important.  You get no advance warning when the phone rings, but if you don’t answer until the third or fourth ring, no problem.  Who is calling determines the importance, and sometimes even letting it go to voice mail may be just fine.

Also in real life, many things can happen at once.  While you are rushing to catch the flight, you realize that you left your ticket at home.  Just then your phone rings, someone bumps you from behind, you drop your bag, and then you notice that your wallet is missing.  What to do first?

Real-time computing systems are often put under similar demands, and need to respond in the best possible way.  An optimized system would meet all hard deadlines as often as possible, and minimize the number of soft deadlines missed and/or minimize the lateness of the response time.  Generally speaking, this is what it means to achieve optimal system performance within given time constraints.
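
One simple way to see this trade-off is to run the airport scenario through a scheduler. The Python sketch below uses earliest-deadline-first ordering, a common textbook policy that we have swapped in for illustration. It is not Jensen's method, and the task names, durations, and deadlines are invented.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    duration: float  # time needed to complete the task
    deadline: float  # time by which it should be complete
    hard: bool       # True if a miss is unacceptable


def run_earliest_deadline_first(tasks):
    """Run tasks one at a time in deadline order and record each lateness."""
    clock = 0.0
    results = []
    for task in sorted(tasks, key=lambda t: t.deadline):
        clock += task.duration
        lateness = max(0.0, clock - task.deadline)
        results.append((task.name, task.hard, lateness))
    return results


def summarize(results):
    """Return the hard deadlines missed and the total lateness of soft tasks."""
    hard_missed = [name for name, hard, late in results if hard and late > 0]
    soft_lateness = sum(late for _, hard, late in results if not hard)
    return hard_missed, soft_lateness


tasks = [
    Task("board the plane", duration=5, deadline=10, hard=True),
    Task("answer the phone", duration=2, deadline=3, hard=False),
    Task("pick up the bag", duration=1, deadline=4, hard=False),
]
print(summarize(run_earliest_deadline_first(tasks)))  # ([], 0.0)
```

In this example every deadline is met. With a heavier load, the same summary shows which hard deadlines were missed and how late the soft ones ran.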

A realistic approach to real-time establishes that each system, large or small, fast or slow, will have its own particular time constraints and acceptable level of optimal performance. Naturally, that can include cloud systems.  Taking this approach, we are now ready to look at the demands of some typical real-time systems, and how they can best be met when moved to the cloud.

Technically “Real Time”

For the past few months we’ve been talking about cloud computing, and how it might be able to support the flow of data in real time.  We’ve examined some of the different kinds of clouds, and highlighted their benefits, but we haven’t yet touched on the idea of “real time”.  What exactly does the “real-time” in “Real-Time Cloud” mean?  Is it really even possible to implement cloud computing for real-time data?

Technically speaking, a real-time computing system is 100% predictable and deterministic.  It guarantees a response within very narrow time constraints.  A missed deadline results in total system failure.  For example, the computer-controlled anti-lock braking system in your car is a real-time system.  If the signal is processed too late, you could lose control of the car, resulting in serious damage or injury.

Aircraft and missile defense systems often use real-time computing.  According to Answers.com, the US Defense Department Military Dictionary defines “real time” like this: “Pertaining to the timeliness of data or information which has been delayed only by the time required for electronic communication. This implies that there are no noticeable delays.”

Real-time processing is sometimes distinguished from batch processing.  In batch processing, a number of tasks are gathered by a system and then processed together, in a batch.  In contrast, in real-time processing, each task is processed immediately, in a continual flow of input, processing, and output.  This mode of processing is generally expected to produce output in synch with real time.
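
The difference between the two modes can be sketched in a few lines of Python. The `process` function here is a stand-in for real work, and all of the names are hypothetical. The point is only when results become available: after the whole batch is gathered, or one at a time as each task arrives.

```python
def process(task):
    # Stand-in for real work: here we simply square the input.
    return task * task


def batch_process(tasks):
    """Gather all tasks first, then process them together."""
    gathered = list(tasks)  # nothing is output until gathering is finished
    return [process(t) for t in gathered]


def stream_process(tasks):
    """Process each task as it arrives and yield its result immediately."""
    for t in tasks:
        yield process(t)


incoming = [1, 2, 3]
print(batch_process(incoming))  # [1, 4, 9]
for result in stream_process(incoming):
    print(result)  # one result per arriving task
```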

Another aspect of real-time processing in some systems deals with regulating the rate of computation so that it proceeds at exactly the same speed as time passes in our physical reality, neither slower nor faster.  Like a video that can be sped up or put into slow motion, such real-time systems are designed to synchronize with the actual rate of time experienced by the system they model.

People also speak about real time with varying degrees of hardness.  A real-time system where missing a deadline results in total system failure is often referred to as “hard” real time.  However, there are many real-time systems in which being a little late occasionally does not spell absolute disaster.  If a deadline can be missed every now and then, a system may be referred to as “firm” real time.  Or if a slightly late response is still somewhat useful, and not a complete loss, then the system is sometimes known as “soft” real time.
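
One way to picture these degrees of hardness is as a function that says how useful a response still is once it is late. The Python sketch below is our own illustration and rests on two assumptions: a late “firm” result is simply discarded, and the value of a late “soft” result fades in a straight line over a grace period. Neither detail comes from a formal definition.

```python
class HardDeadlineMissed(Exception):
    """Raised when a hard real-time deadline is missed."""


def response_value(kind, lateness_ms, grace_ms=100.0):
    """Return the usefulness (0.0 to 1.0) of a response that is lateness_ms late."""
    if lateness_ms <= 0:
        return 1.0  # on time: full value in every category
    if kind == "hard":
        raise HardDeadlineMissed("a missed deadline means system failure")
    if kind == "firm":
        return 0.0  # assumed: the late result is discarded, the system carries on
    if kind == "soft":
        return max(0.0, 1.0 - lateness_ms / grace_ms)  # assumed linear fade
    raise ValueError(f"unknown kind: {kind}")


print(response_value("firm", 25))  # 0.0
print(response_value("soft", 25))  # 0.75
```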

Along these lines, Answers.com gives another entry from the US Defense dictionary for “near real time,” defining it as follows: “Pertaining to the timeliness of data or information which has been delayed by the time required for electronic communication and automatic data processing. This implies that there are no significant delays. Also called NRT.”

How does all of this apply to cloud computing?  Clearly, with the unpredictability of the Internet, it would be quite a stretch to categorize a real-time cloud system as hard real time.  So what is real-time cloud computing, then? Is it soft real time?  Or near real time?  Or something else?  Next week we’ll see what one highly-regarded expert in the field of real-time systems has to say about real-world real time.

CEO Perspectives 4: Leading the Transformation

Is cloud computing inevitable?  Some people seem to think so.  Try typing the words “cloud,” “computing,” and “inevitable” into Google and you’ll get millions of hits.  Last year cloud computing reached a peak on the Gartner Hype Cycle.  While the more conservative players are willing to sit back and take a wait-and-see approach, a growing number of companies are diving in, and leading the transformation.

According to Andrew McAfee in his article What Every CEO Needs to Know About the Cloud, there is a gradual but inevitable shift toward the cloud.  He expects those who get in early to be in an increasingly better position as time passes, and those who linger to be put at a greater and greater disadvantage, until they either join or get lost in the dust.  McAfee gives some general guidelines for starting a move into the cloud, which can be used by anyone interested in putting real-time data on the cloud.

Know Your Responsibilities
To start with, McAfee suggests becoming aware of legal implications.  Clouds in the sky have no respect for man-made concepts like country borders, but your data does not have the same luxury.  Some countries limit what kinds of data can be moved or stored outside their borders.  For example, the EU Data Protection Directive restricts data on personal status from passing through countries that do not provide an “adequate level of protection” for the data.  Other countries have strict privacy laws for any data transmitted on a public network or stored in a cloud server.  You will need to verify that your system meets the legal requirements of all countries in which you expect it to function.

Understand the Risks
We talked about security risks last week, pointing out that they may be different than common wisdom would suggest.  Questions about cost and reliability were addressed in an earlier blog.  McAfee advises executives to become informed of the risks and limitations of cloud computing, involving their general counsels and compliance departments early on.  There are a few areas, such as data subject to export regulations or related to personal health information, that may warrant a conservative approach, but in general he advocates boldly moving forward.  Plant managers and engineers will of course need to take a close look at their specific circumstances to decide what parts of their data sets can be made available in a cloud application.

Evaluate Attitudes
As with any new undertaking, there will be different levels of interest and willingness to change, both within the organization and outside.  Those most eager to implement a real-time cloud system will need to gauge its appeal among key decision makers and managers who are expected to implement it.  McAfee says, “a CIO’s lack of enthusiasm about the cloud these days is about as red a flag as a factory manager’s disinterest in electrification would have been a century ago.”

At the same time, consider software vendors.  What is their attitude toward cloud computing?  What plans do SCADA suppliers and other providers of software for real-time applications have in place to support a move to the cloud?  Some may add the word “cloud” to their networked process control software, but does it really meet the core requirements for a real-time cloud system?

Experiment
Having done your homework, you are ready to try it.  McAfee suggests starting small.  Experiment.  He talks about non-real-time business systems, but the principle is the same.  Don’t expect to immediately move a whole SCADA system onto the cloud.  Maybe you can implement a web-based HMI to present a limited data set to selected customers.  Or possibly connect remote field devices to a cloud server for monitoring in a web browser.  As you gain experience, you may want to set up a private or a hybrid cloud.  Then, as time passes and cloud computing goes even more mainstream, you’ll be in a position to consider expanding further.

It is still too early in the history of cloud computing to know with absolute certainty that this is indeed the way of the future.  But things have reached a point where it would probably be wise to consider it seriously.  As consumer and business applications increasingly move into the cloud, real-time solutions won’t be far behind.  Somewhere between head-in-the-sand and off-the-deep-end, McAfee suggests a cautious, realistic, small-scale, try-and-see attitude to gain experience and build capabilities that may prove valuable in the near future.