Ah, the Internet of Things. If you're like me, then you'll have started hearing this term a good few years ago (along with much hype about how your toaster would soon be ordering its own bread/bossing the kettle about/taking over the world).
Even though my toaster remains stubbornly stupid, so many industrial and digital devices are now being built with the ability to communicate vast amounts of information about themselves that the business use cases for Internet of Things solutions are really starting to emerge. Huge datasets, combined with tools equipped to consume them and draw insights from them, are increasingly being used by forward-thinking industry leaders (as well as disruptive innovators) to drive business goals.
So that's the use case we'll examine today - that of a manufacturing company who have invested in a couple of IoT-ready CNC machines (the kind used to grind blocks of metal down into intricate parts and components) and have now decided to start consuming, visualising and acting upon the data being produced by these new additions to their factory floor.
For the uninitiated, all of the DX Demo posts on the site will follow a similar format of taking a tech tutorial or course (or multiples thereof) and creating/expanding upon the business use cases that the solution could support. This is the first one though, so I'm sure we may iterate on the details along the way 🙂 Suggestions gratefully received via the comments section.
This post is no exception - the main source for this DX demo is the excellent tutorial 'Processing MTConnect data using Azure IoT hub, Stream Analytics, Azure Functions and Power BI live reports'. The tutorial focuses on building a cloud-based solution which consumes and processes a live data stream from a couple of internet-connected CNC machines, then covers connecting the data stream to Power BI (Microsoft's well-known Business Intelligence tool) to provide visualisations of the dataset.
Note that I'll be making a couple of tweaks to the steps outlined in the tutorial here and there, just to make the overall DX Demo more accessible in a business context.
I also used Microsoft's own documentation to help with throttling the amount of data in my streaming dataset.
In order to build up the business context around the tech solution, I've put together a mock scenario - this should hopefully make the content accessible and relevant to all readers, and help to highlight the practical use cases of the technology.
- Business Name: DXB Motor Parts
- Overview: DXB Motors make high quality automotive components for various manufacturers, based either on designs from their own in-house innovation team, or via order direct from their customers. Their main use case is milling aluminium stock material into high-performance engine components.
- Business process: The diagram below shows the business process as relates to the CNC machines:
- Aim: DXB Motors hope that by leveraging the power of the datasets being produced by their new IoT-ready CNC machines, they will be able to bring multiple improvements and innovations to their production lifecycle.
Now that we've defined our mock business, let's take a look at our mock audience. In order to contextualise the technical output from the tutorial, I think it's important to put ourselves in the shoes of some of the roles that would have a stake in the digital solution we're looking at. These people represent the Customer Voice, so it's useful to look at what their expectations are at the outset. Later, we'll look at the extent to which the solution has met them.
Once we have the whole solution up and running, I'm sure we'll identify a few more audiences, but these will serve for now.
STEP ONE - WHERE'S THE DATA?
Well, to start with, it's on the factory floor - the CNC machines which grind/cut/mill the products from slabs of metal generate tons of data themselves. Typically this is information about the 3D position of the material/cutting head expressed in X, Y and Z axes, but they can also transmit information about the amount of load being placed on components and their temperature, as well as other data points such as alarm statuses.
The data can be combined with other data sources that might be useful to the machine owner - for example, the ambient temperature on the shop floor, or the vibration being recorded by external accelerometers placed strategically on or around the machine. The business goal will be to accurately record the actions of the device as well as any other peripheral data that may contribute to the overall picture of its performance.
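To make the idea of blending sources a little more concrete, here's a minimal Python sketch. It attaches the most recent ambient temperature reading to each machine sample. The field names (`spindle_load`, `ambient_c`) and the readings themselves are invented for illustration - they don't come from the tutorial or from a real machine.

```python
from bisect import bisect_right
from datetime import datetime


def parse_ts(text):
    """Parse an ISO 8601 timestamp such as 2018-01-02T09:00:05."""
    return datetime.fromisoformat(text)


def blend(machine_samples, ambient_readings):
    """Attach the most recent ambient temperature to each machine sample."""
    ambient = sorted(ambient_readings, key=lambda r: parse_ts(r["timestamp"]))
    times = [parse_ts(r["timestamp"]) for r in ambient]
    blended = []
    for sample in machine_samples:
        # Index of the last ambient reading taken at or before this sample.
        idx = bisect_right(times, parse_ts(sample["timestamp"])) - 1
        temperature = ambient[idx]["ambient_c"] if idx >= 0 else None
        blended.append({**sample, "ambient_c": temperature})
    return blended


# Hypothetical readings, purely for illustration.
machine_samples = [
    {"timestamp": "2018-01-02T09:00:05", "machine": "cnc-1", "spindle_load": 41.0},
    {"timestamp": "2018-01-02T09:01:10", "machine": "cnc-1", "spindle_load": 78.5},
]
ambient_readings = [
    {"timestamp": "2018-01-02T09:00:00", "ambient_c": 21.5},
    {"timestamp": "2018-01-02T09:01:00", "ambient_c": 22.1},
]

blended = blend(machine_samples, ambient_readings)
for row in blended:
    print(row)
```

A sample taken before the first ambient reading gets `None` rather than a guessed value, which seems the safer default for anything that might later drive an alert.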
In the absence of having our own factory floor replete with various cutting edge CNC machines, we'll have to settle for a substitute. Thankfully, we can connect to a publicly available data stream of CNC machine data. The helpful people at MIST have exposed a dataset which will emulate some CNC machine data for us.
The tutorial informs us where to get the data - it's available from this URL, which, when accessed repeatedly, will provide a non-stop streaming dataset containing exactly the kind of data we'd want to examine on our IoT dashboard. This is very close to the true scenario we'd follow if we really did have our own machines - if we did, each of them would be transmitting data to a central controller device, which would amalgamate the data into a single stream then publish it to the web, making all of our data available from a single URL (although for commercial security, we probably wouldn't publish to a URL which was accessible to Joe Public 🙂 )
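'Accessed repeatedly' is really just a polling loop. The sketch below is my own illustration rather than anything from the tutorial: `fetch` stands in for whatever retrieves the document (for a real agent it might wrap `urllib.request.urlopen`), and here it's fed canned strings so that the example runs without a network connection.

```python
import time


def poll(fetch, handle, interval_seconds=1.0, max_polls=5, sleep=time.sleep):
    """Call fetch() repeatedly, passing each new document to handle()."""
    last_seen = None
    handled = 0
    for _ in range(max_polls):
        document = fetch()
        # Only pass the document on if it differs from the previous poll.
        if document != last_seen:
            handle(document)
            handled += 1
            last_seen = document
        sleep(interval_seconds)
    return handled


# Stand-in for the agent: canned responses instead of a real HTTP request.
canned = iter(["<doc t='1'/>", "<doc t='1'/>", "<doc t='2'/>"])
received = []
count = poll(lambda: next(canned), received.append, interval_seconds=0, max_polls=3)
print(count, received)
```

Skipping a document that is identical to the previous one is a simple way to avoid forwarding duplicates when we poll faster than the data changes.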
STEP TWO - ASSESSING OUR DATA SOURCE
The data our machines are producing is continually changing. To understand what that means it's probably helpful to simply open the site and take a look at the data shown there. Try refreshing the page a couple of times, and pay attention to the 'timestamp' field - as you can see the data is changing constantly, with each new XML document being given a different timestamp to show which moment in time the data export relates to. It's basically a procession of documents giving a detailed account of what was happening over the course of every second that the machine has been running.
It's worth taking a moment to scan through the data and look at the types of fields which are being reported on. This is the data that we intend to ‘listen to’, capture and push into our Azure platform for processing before we can start working with it to solve business problems.
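If you'd rather scan the document with a script than by eye, something like the following Python sketch would do it. The XML here is a heavily simplified, made-up sample in the general shape of an MTConnect streams document (a header with a creation time, then timestamped observations grouped by device and component) - the real feed contains far more fields, and its names and values will differ.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Simplified, invented sample - not a copy of the real feed.
SAMPLE_XML = """<MTConnectStreams xmlns="urn:mtconnect.org:MTConnectStreams:1.3">
  <Header creationTime="2018-01-02T09:00:05Z" nextSequence="105"/>
  <Streams>
    <DeviceStream name="cnc-1" uuid="demo-0001">
      <ComponentStream component="Linear" name="X">
        <Samples>
          <Position name="Xact" sequence="101" timestamp="2018-01-02T09:00:04.120Z">12.5</Position>
          <Load name="Xload" sequence="102" timestamp="2018-01-02T09:00:04.120Z">37</Load>
        </Samples>
      </ComponentStream>
      <ComponentStream component="Linear" name="Y">
        <Samples>
          <Position name="Yact" sequence="103" timestamp="2018-01-02T09:00:04.310Z">-4.25</Position>
        </Samples>
      </ComponentStream>
      <ComponentStream component="Device" name="cnc-1">
        <Events>
          <PowerState name="power" sequence="104" timestamp="2018-01-02T09:00:03.000Z">ON</PowerState>
        </Events>
      </ComponentStream>
    </DeviceStream>
  </Streams>
</MTConnectStreams>"""


def local_name(tag):
    """Strip the XML namespace: '{urn:...}Position' becomes 'Position'."""
    return tag.rsplit("}", 1)[-1]


def summarise(xml_text):
    """Return the document's creation time and a count of each field type."""
    root = ET.fromstring(xml_text)
    header = next(el for el in root.iter() if local_name(el.tag) == "Header")
    # Treat every element carrying a timestamp attribute as one observation.
    field_types = Counter(
        local_name(el.tag) for el in root.iter() if "timestamp" in el.attrib
    )
    return header.get("creationTime"), dict(field_types)


created, field_types = summarise(SAMPLE_XML)
print(created, field_types)
```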
Ultimately, we'll need a component to do our 'listening', grab the data and then forward it to our Azure cloud computing platform, but first we'll need to set up the place where our data will be sent...
STEP THREE - WHERE DOES IT ALL GO?
Clearly there are a few different options available when thinking about cloud platforms capable of ingesting our stream of data, but based on the tutorial we're following today, the answer is that our data will be getting forwarded to, consumed and processed by a series of standard components available on Microsoft's Azure platform.
Before we start sending data, we need to have a place for it to be picked up and imported into Azure. The component in Azure that will perform this function is called an IoT Hub. This is essentially a component which is designed to consume a stream of data and make it available to other components within Azure. The setup process was relatively straightforward - once you've selected the IoT Hub resource type in the Azure portal, there are only a handful of parameters to fill in. Once it's set up, you'll need to copy the endpoint URL which is displayed for the hub - this URL is the definitive answer to our 'where does the data go' question.
STEP FOUR - FORMATTING AND FORWARDING
Now that we have a place to send our data, we'd better get on with doing just that.
The data being generated by our two mock machines is currently in a format called MTConnect (a protocol used by many CNC machines, and which is communicated as XML documents). Not all machines send their data like this, but it's pretty popular, and for those that don't send it like this, engineers often put an 'adaptor' application in place which consumes the output from their machine and converts it into MTConnect.
This isn’t to say that MTConnect is perfect though. It's an XML format, which isn't really ideal for using as an input for modern applications (it’s more difficult to parse than comparable formats such as JSON). It's also just one-way - i.e. machines can send MTConnect telemetry out to report what they are doing, but they can't receive MTConnect telemetry in as a way of receiving instructions. That's a shame, as one of the key use cases for analysing real-time data from CNC machines would be to send an instruction to a machine to do something based on the data received - for example, sending an instruction to stop the machine if the temperature of a certain component breaches a limit you previously set.
One-way limitations aside (we'll come back to that later) we still need to do something about grabbing the XML formatted data coming from the agent site. As per the demo we're following, the solution is a small console application (which the tutorial author has kindly provided). The program's purpose is simple - it consumes the XML from the agent site and converts a subset of that data into a JSON stream (which we can use more readily inside the Azure platform) and forwards it to a URL of our choice.
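The console application supplied with the tutorial is a Visual Studio project, and I haven't reproduced its code here. Purely to illustrate the XML-to-JSON idea, here's a rough Python equivalent. The sample XML, the choice of which field types to keep and the JSON property names are all my own assumptions, not the tutorial's.

```python
import json
import xml.etree.ElementTree as ET

# Invented sample in the general shape of an MTConnect streams document.
SAMPLE_XML = """<MTConnectStreams xmlns="urn:mtconnect.org:MTConnectStreams:1.3">
  <Streams>
    <DeviceStream name="cnc-1">
      <ComponentStream component="Linear" name="X">
        <Samples>
          <Position name="Xact" timestamp="2018-01-02T09:00:04.120Z">12.5</Position>
          <Load name="Xload" timestamp="2018-01-02T09:00:04.120Z">UNAVAILABLE</Load>
        </Samples>
      </ComponentStream>
      <ComponentStream component="Device" name="cnc-1">
        <Events>
          <PowerState name="power" timestamp="2018-01-02T09:00:03.000Z">ON</PowerState>
        </Events>
      </ComponentStream>
    </DeviceStream>
  </Streams>
</MTConnectStreams>"""

WANTED = {"Position", "Load"}  # assumed subset; the tutorial's may differ


def local_name(tag):
    """Strip the XML namespace from a tag name."""
    return tag.rsplit("}", 1)[-1]


def as_number(text):
    """Return a float where possible, otherwise the original string."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return text


def to_json_messages(xml_text, wanted=WANTED):
    """Convert the wanted observations into one JSON string per observation."""
    root = ET.fromstring(xml_text)
    messages = []
    for device in (el for el in root.iter() if local_name(el.tag) == "DeviceStream"):
        for component in (el for el in device if local_name(el.tag) == "ComponentStream"):
            for item in component.iter():
                kind = local_name(item.tag)
                if kind not in wanted or "timestamp" not in item.attrib:
                    continue
                messages.append(json.dumps({
                    "machine": device.get("name"),
                    "component": component.get("name"),
                    "type": kind,
                    "name": item.get("name"),
                    "timestamp": item.get("timestamp"),
                    "value": as_number(item.text),
                }, sort_keys=True))
    return messages


messages = to_json_messages(SAMPLE_XML)
for message in messages:
    print(message)
```

Each observation becomes its own small, flat JSON message, which is much easier to query further down the line than the nested XML it came from.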
So, the only thing that needs to be changed in our console application code once we’ve downloaded it is the web address for the service that the data needs to go to – instructions are provided in the tutorial. The address we need to use is the URL we copied at the end of step three.
STEP FIVE - CONFIGURING STREAM ANALYTICS
So, if we've configured our Collector Application correctly we should now have data flowing into our IoT Hub in Azure. There's a quick way to check that this is happening - simply go to the IoT Hub resource page and take a look at the Device to Cloud Messages graph. If your application has been running for a while, you'll be able to see some activity on the graph (switch to the 1 hour view if you've only just kicked the Collector application off).
All being well, we can continue with the creation of the Stream Analytics Job. This is really the core of the IoT project we're following. The Stream Analytics job is responsible for consuming the dataset we want from the IoT Hub, then splitting it out into the various outputs we're interested in (Power BI, Table Storage and Azure Functions). The creation of the Job itself isn't too much trouble - again, the parameters outlined in the tutorial are sufficient to get things rolling.
As we can see from the diagram above, what we're really doing in this step is setting up the internal workflow within the Azure platform with regard to the data we're ingesting from the Console Application.
Why have we configured three outputs from our Stream Analytics job? Well, the tutorial is going down this road in order to help us understand that once we have the data in Azure, there are different options we have open to us for solving business needs.
- The Power BI output is about providing visibility of the streaming dataset to appropriate users (and potentially alerts could be triggered directly from there if necessary - for example, if a threshold for temperature is breached).
- The Table Storage output is used to collect the data (or a subset of it) in a flat table format. This might be used as a source of historic data for troubleshooting, or eventually as training data for a Machine Learning process to ingest.
- The Functions Queue output is being used here as an example of how particular follow-on tasks could be triggered by specific events that are spotted in the datastream. A good example might be that if an accelerometer breaches an emergency safety threshold, a shutdown of the machine could be triggered. As I mentioned earlier in this post, it's unfortunate that the MTConnect protocol is one-way. This means that any trigger to shut the machine down would have to either speak to the machine via a separate 'write' protocol (such as OPC-UA) or it could potentially be integrated with a separate system that controls power to the machine.
The important thing to note here is that the particular subset of data being sent to each of these outputs from the stream being ingested by the Stream Job is configurable (using a variant of SQL to create suitable queries for each output). In the tutorial example, we're sending just a specific type of event to the Function Queue, just Axis data to the Power BI output and everything to the Table Storage.
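The real routing is written in the Stream Analytics query language, which I won't try to reproduce here. The Python sketch below just mimics the logic described above: everything goes to Table Storage, axis positions go to Power BI, and one specific event type goes to the Functions queue. The output names, the event shape and the choice of `PowerState` as the 'specific type' are my own assumptions.

```python
AXES = {"X", "Y", "Z"}


def route(event):
    """Return the list of outputs a single event should be sent to."""
    outputs = ["table_storage"]  # everything is archived
    if event["type"] == "Position" and event.get("axis") in AXES:
        outputs.append("power_bi")  # only axis data is visualised
    if event["type"] == "PowerState":
        outputs.append("function_queue")  # assumed trigger event type
    return outputs


# Hypothetical events, shaped like the JSON messages sent to the IoT Hub.
events = [
    {"type": "Position", "axis": "X", "value": 12.5},
    {"type": "Load", "axis": "X", "value": 37.0},
    {"type": "PowerState", "value": "OFF"},
]

routed = {}
for event in events:
    for output in route(event):
        routed.setdefault(output, []).append(event)

for output, items in routed.items():
    print(output, len(items))
```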
STEP SIX - OUR DATA IN POWER BI
So we're pretty much there now - if we've set up our Power BI Output in Azure correctly, then the dataset should be available in the instance of Power BI that we specified in the config.
For me, this was one of the trickiest parts of the tutorial. The dataset was available in Power BI, but whenever I tried to create a streaming graph based on it, the graph wouldn't work.
I basically had two issues:
Firstly, my dataset wasn't using a datetime format that Power BI understood - I had to modify my SQL query to make it more specific.
Secondly, Power BI kept throwing errors related to receiving too much data from the stream. I therefore had to find a way of throttling the amount of data coming through - thankfully I found some documentation from Microsoft themselves which outlined how this could be achieved.
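I made both fixes in the Stream Analytics query itself, and I won't reproduce the exact changes here. To show the two ideas in isolation, here's a Python sketch: one function rewrites timestamps into a single explicit UTC format, and the other collapses the stream into one averaged row per axis per fixed time window (a 'tumbling' window), which cuts down the number of rows sent on. The five-second window, the field names and the assumption that an explicit ISO 8601 UTC timestamp is acceptable are all mine.

```python
from datetime import datetime, timezone


def parse_utc(text):
    """Parse an ISO 8601 timestamp, treating naive values as UTC."""
    parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def normalise_timestamp(text):
    """Rewrite a timestamp as UTC with millisecond precision and a Z suffix."""
    utc = parse_utc(text)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def tumbling_average(events, window_seconds=5):
    """Collapse events into one averaged row per axis per time window."""
    buckets = {}
    for event in events:
        epoch = int(parse_utc(event["timestamp"]).timestamp())
        # Round down to the start of the window this event falls into.
        window_start = epoch // window_seconds * window_seconds
        buckets.setdefault((window_start, event["axis"]), []).append(event["value"])
    rows = []
    for (window_start, axis), values in sorted(buckets.items()):
        rows.append({
            "window_start": datetime.fromtimestamp(window_start, tz=timezone.utc).isoformat(),
            "axis": axis,
            "value": sum(values) / len(values),
            "count": len(values),
        })
    return rows


# Hypothetical X-axis positions, one every two seconds.
events = [
    {"timestamp": "2018-01-02T09:00:00Z", "axis": "X", "value": 1.0},
    {"timestamp": "2018-01-02T09:00:02Z", "axis": "X", "value": 2.0},
    {"timestamp": "2018-01-02T09:00:04Z", "axis": "X", "value": 3.0},
    {"timestamp": "2018-01-02T09:00:06Z", "axis": "X", "value": 10.0},
]

rows = tumbling_average(events)
for row in rows:
    print(row)
print(normalise_timestamp("2018-01-02T10:00:04+01:00"))
```

Four incoming events become two outgoing rows here. A wider window would thin the data further, at the cost of a less responsive chart.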
In the end.... Success!
It's worth noting that I deviated a bit from the tutorial from this point onwards (I wanted to examine a slightly different dataset in Power BI, so I modified the query which fed that output in Azure)
With regards to the Power BI Dashboard, I tried a couple of layout options. Initially I tried combining the three axes (X, Y and Z) into a single dataset and plotting them on one chart but, as we can see, the data for one of the axes is incomplete on the graph. I saw this kind of glitch quite regularly (along with the warnings about the input streaming too much data).
I therefore split my data out into three separate output streams in Azure (one for each Axis I wanted to observe):
This seemed much more stable, but I still saw the occasional glitch (definitely something I'd want to dig into in much more detail if this were a production build, and not a high level DX overview!).
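In Azure, the split was done with one query per output. As a language-neutral illustration of the same idea, this Python sketch separates a mixed list of events into one stream per axis and reports any axis that received no data at all - a crude version of the check I'd want when chasing those glitches. The event shape is hypothetical.

```python
def split_by_axis(events, axes=("X", "Y", "Z")):
    """Separate position events into one list per axis."""
    streams = {axis: [] for axis in axes}
    for event in events:
        if event.get("type") == "Position" and event.get("axis") in streams:
            streams[event["axis"]].append(event)
    return streams


def missing_axes(streams):
    """Return the axes that received no events at all."""
    return [axis for axis, rows in streams.items() if not rows]


# Hypothetical batch in which the Z axis has gone quiet.
events = [
    {"type": "Position", "axis": "X", "value": 12.5},
    {"type": "Position", "axis": "Y", "value": -4.25},
    {"type": "Position", "axis": "X", "value": 12.7},
    {"type": "Load", "axis": "X", "value": 37.0},
]

streams = split_by_axis(events)
print({axis: len(rows) for axis, rows in streams.items()})
print("No data for:", missing_axes(streams))
```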
STEP SEVEN - OUR SMS ALERTS
Again, we're straying slightly off the track of the core tutorial here, but only so that we can properly envisage the business benefits being generated.
In Step five, we configured our Stream Analytics job to send data to three outputs - Table Storage (all events), Power BI (events associated with the X, Y and Z axes), and a separate query which forwards a specific type of event to a Service Bus Queue.
It's this last type that we'll look at now - the use case of having Azure Stream Analytics feed a queue, which is subsequently processed by Azure functions in order to drive an external application.
Our use case will be the following: Stream Analytics will separate out events relating to power status changes which happen during working hours (for what it's worth, I've no idea if this is actually an event that anyone would be interested in - we're obviously faking it a bit here to highlight a potential scenario where an alert might be useful).
Events which match the criteria can be consumed by Azure Functions, which then trigger a communication with an external service (for this scenario we could use Twilio - an API-driven SMS gateway).
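Here's a small Python sketch of that filtering logic, kept separate from any real Azure Functions or SMS gateway code. It looks for changes in power state, keeps only those falling inside working hours and builds the alert text. The working hours (08:00 to 18:00, Monday to Friday), the event shape and the message wording are all assumptions of mine, and actually sending the SMS is deliberately left out.

```python
from datetime import datetime, time

WORK_START, WORK_END = time(8, 0), time(18, 0)  # assumed working hours


def power_changes(events):
    """Yield power events whose state differs from that machine's previous state."""
    last_state = {}
    for event in events:
        if event["type"] != "PowerState":
            continue
        previous = last_state.get(event["machine"])
        last_state[event["machine"]] = event["value"]
        if previous is not None and previous != event["value"]:
            yield {**event, "previous": previous}


def in_working_hours(timestamp):
    """True for Monday to Friday between WORK_START and WORK_END."""
    moment = datetime.fromisoformat(timestamp)
    return moment.weekday() < 5 and WORK_START <= moment.time() < WORK_END


def build_alerts(events):
    """Return the alert text for each power change inside working hours."""
    return [
        f"{c['machine']} power changed from {c['previous']} to {c['value']} at {c['timestamp']}"
        for c in power_changes(events)
        if in_working_hours(c["timestamp"])
    ]


# Hypothetical events (2 January 2018 was a Tuesday).
events = [
    {"type": "PowerState", "machine": "cnc-1", "value": "ON", "timestamp": "2018-01-02T09:00:00"},
    {"type": "PowerState", "machine": "cnc-2", "value": "ON", "timestamp": "2018-01-02T09:05:00"},
    {"type": "PowerState", "machine": "cnc-1", "value": "OFF", "timestamp": "2018-01-02T10:15:00"},
    {"type": "PowerState", "machine": "cnc-1", "value": "ON", "timestamp": "2018-01-02T22:30:00"},
]

alerts = build_alerts(events)
for alert in alerts:
    print(alert)
```

The first reading for each machine never raises an alert, because there's no earlier state to compare it with. The 22:30 change is detected but dropped for falling outside working hours.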
The ultimate result of this type of integration is the ability for certain types of event to trigger real-time alerts to specified end users:
Here's a mockup of the intended alert - as can be seen, the idea here is to enable businesses to react to real-time information being generated by IoT devices.
The example given is just to illustrate the concept - there are many more use cases where this type of functionality could be deployed, especially when data is blended with additional sources:
- Email alerts (plus CRM updates) to customers to indicate when their order is completed, based on the event stream for a distinct Order ID ending.
- Emergency shut down commands issued to API driven power sources (if available) when events which have been identified as dangerous are recorded.
- Updates to Stock management systems based upon each new 'job started' event.
- Alerts to facilities managers to investigate if the ambient temperature is too high for the job in progress.
Those are just a few ideas (I'm sure that people who actually work in the manufacturing industry could think of many more) but as can be seen, the potential for increasing safety, quality and efficiency is huge.
THE BIG PICTURE
Let's take a final look at the business landscape in our IoT enabled business.
Cast your mind back to the audiences that we defined back towards the start of the post - do you think their expectations were met? Partially or wholly? How important are the additional audiences that have been defined in the big picture below? Are there others that I've missed? Feel free to leave your answers in the comments section at the end.
If you truly made it this far - Thanks for reading!
The tools I used for the blog post were:
Microsoft Azure (and the additional Storage Explorer application). Azure is at the core of the solution - I used several of its out-of-the-box components to construct the solution (mostly config, only a tiny bit of code)
Visual Studio 2017 Community Edition. Used for making slight edits to the Console Application provided as part of the solution.
Microsoft Power BI (Online Edition). Note that Power BI comes in two flavours - a Desktop-based client and an Online web application. For dealing with streaming datasets, only the Online version can be used.
Lucidchart - for all of my diagrams.