Device To Cloud Connectivity with Azure IoT Hub

IoT (Internet of Things) and IIoT (Industrial Internet of Things) are words that come to mind when we buy a new smart electronic appliance, drive a new car, or think about a sophisticated manufacturing plant.

To make these Things truly intelligent, we not only capture the events they generate; we also analyze the events, predict future behaviour, visualize the details and act based on the analysis & predictions.

Depending on the use case, all or some of the above steps can be performed locally (on-premise), or the events can be transferred to the cloud for complex processing.

As an example, many sophisticated devices now come with their own edge-computing AI/ML features embedded in the chip.

On the other hand, for IIoT cases there is software available at the plant level to display KPIs or highlight alerts.

For a large number of independent IoT devices, e.g. a set of sensors or cameras, we may also want to consolidate the events through an integration layer before pushing the data to the cloud.

Whatever the setup, no one can deny the importance of cloud services for processing IoT/IIoT data, given their high processing capabilities (with advanced AI/ML features), storage capabilities and enterprise data integration, which may not be available in the other layers.

In this blog, we’ll take a very simple device-to-cloud (D2C) scenario and send streams of events to the Azure cloud to process them in both real time and batch.

The Architecture we’ll follow

To process the event stream in both real time and batch, we’ll use the well-known Lambda architecture. As soon as events are pushed into the Azure IoT Hub, they will be routed along two paths:

  1. Hot Path — Stream Analytics consumes the events and outputs them to a Power BI visualization dashboard with little or no transformation.
  2. Cold Path — The events are stored in an Azure Storage Account, transformed periodically (any complex transformation goes here) and stored as Hive tables; Power BI visualizations are refreshed periodically using the aggregated data from Hive.

Further implementation details are as follows:

Setting up an Azure IoT Hub

In the first step, we’ll create an Azure IoT Hub and register a device.

Once the device has been set up, we’ll copy its Primary Connection String and verify that it’s enabled.
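For reference, the registration can also be scripted. Below is a minimal sketch using the azure-iot-hub Python package; the hub connection string is a placeholder and the device id is a hypothetical name:

    import base64
    import os

    from azure.iot.hub import IoTHubRegistryManager

    # Placeholder: connection string of an IoT Hub shared access policy
    # with registry write permissions (e.g. iothubowner)
    IOTHUB_CONNECTION_STRING = "HostName=<your-hub>.azure-devices.net;SharedAccessKeyName=iothubowner;SharedAccessKey=<key>"
    DEVICE_ID = "laptop-device-01"  # hypothetical device id

    # Generate SAS keys for the device locally
    primary_key = base64.b64encode(os.urandom(32)).decode()
    secondary_key = base64.b64encode(os.urandom(32)).decode()

    registry_manager = IoTHubRegistryManager.from_connection_string(IOTHUB_CONNECTION_STRING)
    device = registry_manager.create_device_with_sas(
        DEVICE_ID, primary_key, secondary_key, "enabled"
    )
    print(device.device_id, device.status)  # expect: laptop-device-01 enabled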

IoT Hub Message Routing

We’ll go to our IoT Hub and create a couple of routing paths:

Path 1 — Azure Storage Account — Blob:

Path 2 — Stream Analytics: we’ll select the built-in endpoint (events) and Device Telemetry Messages as the data source.

Configuring the ‘Thing’

In this example, we’ll treat a laptop/computer as the ‘Thing’, i.e. an IoT device, using the Azure IoT Hub device SDKs. The SDKs are very useful for building apps that run directly on devices and send telemetry to the IoT Hub. Using the Python SDK, we’ll send the laptop’s CPU & memory information to our IoT Hub.

Find below a code sample using the SDK:
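(A minimal sketch, assuming the azure-iot-device and psutil packages; the connection string placeholder stands for the device’s Primary Connection String copied earlier, and the 80% alert threshold is an assumption for this example.)

    import json
    import time

    import psutil
    from azure.iot.device import IoTHubDeviceClient, Message

    # Placeholder: the device's Primary Connection String from the IoT Hub portal
    CONNECTION_STRING = "HostName=<your-hub>.azure-devices.net;DeviceId=laptop-device-01;SharedAccessKey=<key>"
    CPU_ALERT_THRESHOLD = 80.0  # assumed threshold for flagging high CPU usage

    client = IoTHubDeviceClient.create_from_connection_string(CONNECTION_STRING)
    client.connect()

    try:
        while True:
            # Collect CPU & memory usage of the laptop
            payload = {
                "deviceId": "laptop-device-01",
                "cpu_pct": psutil.cpu_percent(interval=1),
                "memory_pct": psutil.virtual_memory().percent,
                "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            }
            msg = Message(json.dumps(payload))
            # Mark the body as JSON so downstream consumers can parse it inline
            msg.content_encoding = "utf-8"
            msg.content_type = "application/json"
            # Application property that routes/queries can filter on
            msg.custom_properties["cpu_alert"] = str(payload["cpu_pct"] > CPU_ALERT_THRESHOLD)
            client.send_message(msg)
            time.sleep(10)  # send telemetry every 10 seconds
    finally:
        client.disconnect()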

Batch Processing — Using Azure Databricks with Azure Data Factory

Once events reach the IoT Hub, it routes them to the Azure Storage account we have configured. The data will be stored in appropriate partitions.

We’ll use Apache Spark (Azure Databricks) to read the records, aggregate them as per our requirements and store the results as Hive tables. The Azure Databricks notebook will be invoked by an Azure Data Factory v2 pipeline at regular intervals (e.g. daily).

We can use the following code to read the JSON blobs with an appropriate schema:
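A sketch of the read, assuming the route writes JSON-encoded blobs under IoT Hub’s default {iothub}/{partition}/{YYYY}/{MM}/{DD}/{HH}/{mm} path convention, that the device set the JSON content type and UTF-8 encoding (so the message body lands as inline JSON), and a hypothetical storage container name:

    from pyspark.sql.types import (DoubleType, StringType, StructField,
                                   StructType, TimestampType)

    # Each blob record wraps the telemetry in an envelope; Body holds the device payload
    schema = StructType([
        StructField("EnqueuedTimeUtc", StringType()),
        StructField("Body", StructType([
            StructField("deviceId", StringType()),
            StructField("cpu_pct", DoubleType()),
            StructField("memory_pct", DoubleType()),
            StructField("timestamp", TimestampType()),
        ])),
    ])

    # Hypothetical container/account; wildcards cover the date/time partitions
    events_path = "wasbs://telemetry@<storage-account>.blob.core.windows.net/<iothub>/*/*/*/*/*/*"

    # 'spark' is predefined in a Databricks notebook
    raw_df = spark.read.schema(schema).json(events_path)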

Select the required columns:
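Continuing the sketch, we flatten the Body envelope and derive a date column for the daily aggregation:

    from pyspark.sql import functions as F

    telemetry_df = (
        raw_df
        .select(
            F.col("Body.deviceId").alias("deviceId"),
            F.col("Body.cpu_pct").alias("cpu_pct"),
            F.col("Body.memory_pct").alias("memory_pct"),
            F.col("Body.timestamp").alias("timestamp"),
        )
        .withColumn("event_date", F.to_date("timestamp"))  # daily grain
    )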

Calculate the average & maximum CPU%:
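For example, aggregated per device and per day:

    cpu_usage_df = (
        telemetry_df
        .groupBy("deviceId", "event_date")
        .agg(
            F.avg("cpu_pct").alias("avg_cpu_pct"),
            F.max("cpu_pct").alias("max_cpu_pct"),
        )
    )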

Calculate the total number of alerts received because of high CPU usage:
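Assuming the same 80% threshold used on the device side:

    CPU_ALERT_THRESHOLD = 80.0  # assumed; should match the device-side threshold

    cpu_alert_df = (
        telemetry_df
        .filter(F.col("cpu_pct") > CPU_ALERT_THRESHOLD)
        .groupBy("deviceId", "event_date")
        .agg(F.count("*").alias("alert_count"))
    )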

Save the CPU usage & CPU alert Spark DataFrames into the Data Lake:
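With hypothetical Data Lake paths and Parquet assumed as the storage format:

    cpu_info_path = "abfss://curated@<datalake-account>.dfs.core.windows.net/cpu_info"
    cpu_alert_path = "abfss://curated@<datalake-account>.dfs.core.windows.net/cpu_alert"

    # The daily ADF-triggered run appends that day's aggregates
    cpu_usage_df.write.mode("append").parquet(cpu_info_path)
    cpu_alert_df.write.mode("append").parquet(cpu_alert_path)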

Create two Hive tables on the Data Lake locations:
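The tables (named CPU_Info and CPU_Alert, as referenced by the dashboard later) can be registered as external tables over those locations:

    spark.sql("""
        CREATE TABLE IF NOT EXISTS CPU_Info
        USING PARQUET
        LOCATION 'abfss://curated@<datalake-account>.dfs.core.windows.net/cpu_info'
    """)

    spark.sql("""
        CREATE TABLE IF NOT EXISTS CPU_Alert
        USING PARQUET
        LOCATION 'abfss://curated@<datalake-account>.dfs.core.windows.net/cpu_alert'
    """)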

(In our architecture, we may need to provision an Azure SQL Database or Cosmos DB to store reference data.)

Configure Stream Analytics

We’ll create an Azure Stream Analytics instance, add the IoT Hub as an input and Power BI as an output. Detailed steps are available in the Azure Stream Analytics documentation.

The Stream Analytics query will look like the following (if we’re not transforming the input):
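(A pass-through query of this shape; the input and output alias names are placeholders defined in the job configuration.)

    SELECT
        *
    INTO
        [powerbi-output]
    FROM
        [iothub-input]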

We can start the Stream Analytics job once we’re happy with the configurations.

Power BI Dashboard

Once the Power BI service refreshes the datasets from the Hive tables (e.g. a daily refresh) and real-time events start arriving via Stream Analytics, the Power BI widgets will start displaying the graphs. In the dashboard, the top two widgets with a white background are updated in real time, whereas the two black widgets below them are refreshed daily from the two Hive tables (CPU_Info, CPU_Alert). We can use the Power BI mobile apps to access the dashboards while we’re on the move!

Points to note

  • For simple one-direction D2C connectivity we can use Azure Event Hubs, as it is cheaper than IoT Hub.
  • For both D2C and C2D (cloud-to-device) connectivity we need to use Azure IoT Hub. Verify this requirement before making the decision.
  • To connect multiple devices we can use an integration service such as Sigfox and channel the integrated/clubbed events to IoT Hub or Event Hubs depending on the C2D/D2C scenario.
  • For IIoT cases, factory sensors and devices are generally kept inside a private network and configured in an OPC server, e.g. KEPServerEX. KEPServerEX can be connected to Azure IoT Hub directly or via Azure IoT Edge.

In the next blog, we’ll talk about configuring KEPServerEX to connect directly to Azure IoT Hub and transfer events generated from factory devices.

Thanks for reading. If you have enjoyed it, don’t forget to clap and share! To see similar posts, follow me on Medium & LinkedIn.

