One of the biggest challenges with the internet of things lies in finding a balance between gathering high-resolution data from sensors and aggregating the most relevant exceptions or trends in the cloud. One emerging approach is to process some of this data closer to the edge, and then send only exceptions or summaries to the cloud for storage and further processing.
“Traditionally, IoT edge analytics meant pushing IoT data to the cloud and dumping it into a data lake for big data analytics,” said John Crupi, VP and engineering system architect at Greenwave Systems Inc. “As much as we wished this would be sufficient for real-time analytics, it fell very short. It didn’t matter if we had the fastest computers running Spark. Architecturally, it wasn’t efficient.”
Combining cloud and edge analytics is particularly relevant to embedded analytics, real-time analytics and time-critical insight, Crupi said. Embedded analytics allows gateways and devices to make their own decisions independent of the cloud. Real-time analytics makes it possible to make decisions at about the same time that events happen. Time-critical insights are important for responding to real-time events across a large number of devices, such as cyberattacks.
“Distributed analytics are useful when data is too big to be transferred to the cloud or when bandwidth between sensors and analytics servers is limited,” said Srinath Perera, vice president of research at WSO2 Inc. This is important when the sensors are geographically distributed, the analysis is too complicated or the data is too big to be analyzed on a single machine. Additionally, when transferring the data takes too long, some time-critical data loses its value before it can be analyzed.
Distributed analytics can be done on the sensor itself or via gateways placed near the sensors. Some of the tools for building out these distributed analytics architectures include WSO2 Siddhi, Apache Edgent and Apache MiNiFi. IoT edge analytics algorithms running on or near devices are a subclass of stream processing analytics. Stream processing systems running in the cloud include Apache Kafka, Apache Flink, Apache Storm, Apache Spark Streaming and WSO2 Data Analytics Server.
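The pattern described above, in which a device or gateway processes readings locally and forwards only exceptions and summaries, can be illustrated with a minimal sketch. It is written in plain Python and does not use the API of Siddhi, Edgent, MiNiFi or any other tool named here. The `Message` type, the `edge_filter` function, the window size and the threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable, Iterator


@dataclass
class Message:
    """Hypothetical envelope for whatever the edge forwards to the cloud."""
    kind: str       # "exception" or "summary"
    payload: dict


def edge_filter(readings: Iterable[float], window: int,
                threshold: float) -> Iterator[Message]:
    """Yield only what should leave the edge.

    Assumption: an "exception" is any reading above `threshold`, and a
    "summary" is emitted once per `window` readings. Real deployments
    would define both according to their own rules.
    """
    buffer = []
    for value in readings:
        # Forward out-of-range readings immediately.
        if value > threshold:
            yield Message("exception", {"value": value})
        buffer.append(value)
        # Once the window is full, forward one summary instead of raw data.
        if len(buffer) == window:
            yield Message("summary", {
                "count": len(buffer),
                "mean": mean(buffer),
                "max": max(buffer),
            })
            buffer.clear()


if __name__ == "__main__":
    samples = [20.0, 21.0, 95.0, 22.0, 20.0, 21.0]
    for msg in edge_filter(samples, window=3, threshold=90.0):
        print(msg)
```

In this sketch six raw readings produce three outbound messages: one exception and two window summaries. Larger windows would reduce upstream traffic further, at the cost of coarser summaries reaching the cloud.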
Bringing stream processing to the edge