Introduction

The combination of low-cost networkable devices and the pervasiveness of the Internet has made it possible to connect many things to the Internet so that they can be controlled, read and managed. Examples include smart plugs, lights, blinds and health monitors. No one really knows the full potential of the Internet of Things (IOT) today, but everyone agrees that there will be a great many connected devices (IOTs). Current estimates are between 20 and 50 billion by 2020.

The full potential of all these IOTs will only be realised by making them readily accessible to creative programmers. This requires platforms which can support vast numbers of IOTs. IOTs share some common characteristics which such a platform needs to take into consideration.

  • Connectivity is likely to be intermittent due to their mobility, power saving or environment.
  • Power budget and environmental impact are concerns both for the IOTs and for the platforms needed to support them. Battery and solar powered IOTs are probably only going to be accessible for parts of the day.
  • Computing power of the IOT will be kept low to minimize cost and power usage. This may improve with Moore's law but we must assume that each IOT has minimal compute power, memory and storage capability.
  • There will be lots of different types of IOTs and the standards are still evolving. They will all support TCP/IP but a messaging layer such as MQTT is likely to be common. No doubt, other standards will emerge.

A Scalable IOT Service Platform

No one can predict today what applications will emerge to make use of IOTs. Their full potential will only be realized if application and service creators can each access and combine them. This will be accelerated by having a common platform and APIs to access them, whatever their type. History has taught us that Metcalfe's law prevails: "the value of the network is proportional to the square of the number of connected users". In this case the connected users are the IOTs. We must avoid isolated islands of proprietary IOTs if we want awesome apps and services.

The platform to manage large numbers of IOTs needs to meet the following requirements:

  • Able to connect to millions or billions of IOTs concurrently
  • Able to continue working when network connectivity is lost or is only available for some IOTs sporadically
  • Able to represent state of the IOT even when the device itself is not accessible so that other applications that need this can always obtain it
  • Able to store commands for an IOT so they can be dispatched the next time a connection to it is reestablished
  • Able to pull and push data between each IOT and reliable scalable data stores
  • Entitlements and security policies to ensure data is secure and private by default with opt-in to selectively share where required
  • Encryption of all data at rest and on the public Internet
  • Support for 24x7 operation including during software updates
  • Able to be deployed on a wide range of hardware platforms including Cloud based IaaS platforms and low cost, potentially unreliable servers
  • Readily available and proven components including cluster management, asynchronous message bus, publish/subscribe, distributed processing, HA, distributed memory caches, and scalable, partitionable database support
  • A powerful and productive programming language able to support high levels of concurrency and distributed processing primitives, with a strong user community
  • High levels of integration so that existing components written in any language can be incorporated into the management platform

There are many common areas between a platform to manage IOTs and Cloud solutions for millions or billions of users. The Cloud community has already created solutions to support over 1B users. By leveraging the lessons learnt and the components created to accomplish this, we can accelerate the creation of a platform capable of supporting even larger numbers of IOTs.

We believe an approach that uses an Agent to represent each IOT will be required. The Agent would be continuously available. It will provide the last known state of each IOT and save the commands that must be sent next time it is connected. It will enable other services which harness the data and control capabilities of the IOT to be created without each service having to cope with the intermittent connection and diversity of each IOT. The Agent also isolates the specific characteristics of each IOT from the service layers which want to utilize them.
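
The Agent behaviour described above can be sketched in a few lines. This is only an illustration in Python, not the platform's implementation: the class name `DeviceAgent`, its method names and the example device id are all hypothetical, and a production Agent would also need persistence and access control.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Deque, Dict, List


@dataclass
class DeviceAgent:
    """Always-available stand-in for one intermittently connected IOT (hypothetical)."""

    device_id: str
    last_known_state: Dict[str, Any] = field(default_factory=dict)
    pending_commands: Deque[str] = field(default_factory=deque)
    connected: bool = False

    def report_state(self, state: Dict[str, Any]) -> None:
        # Called when the device sends a reading; merge it into the cached state.
        self.last_known_state.update(state)

    def read_state(self) -> Dict[str, Any]:
        # Services read the cached copy, whether or not the device is online.
        return dict(self.last_known_state)

    def send_command(self, command: str) -> List[str]:
        # Queue the command; it is dispatched now only if the device is online.
        self.pending_commands.append(command)
        return self.flush() if self.connected else []

    def on_connect(self) -> List[str]:
        # The device is back: dispatch everything saved while it was away.
        self.connected = True
        return self.flush()

    def on_disconnect(self) -> None:
        self.connected = False

    def flush(self) -> List[str]:
        # Return queued commands in the order they were issued, then clear the queue.
        delivered = list(self.pending_commands)
        self.pending_commands.clear()
        return delivered


agent = DeviceAgent("plug-42")          # hypothetical device id
agent.report_state({"power": "on"})
agent.on_disconnect()
agent.send_command("turn_off")          # saved, because the device is offline
cached = agent.read_state()             # still answers with the last known state
dispatched = agent.on_connect()         # saved commands go out on reconnect
```

A service built on top of this only ever talks to the Agent, so it does not need to know whether the device happens to be reachable at that moment.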

Asynchronous messaging buses have proved highly successful in creating large scale Cloud services. They decouple the various components, enabling each one to be independently developed and upgraded. Standardised solutions such as the ISO and OASIS Advanced Message Queuing Protocol (AMQP), which is a wire-level standard, enable polyglot applications to be created where each component can be written in a programming language optimized for its problem domain. AMQP is well supported and runs on devices such as the Raspberry Pi, whilst lightweight, low-footprint messaging protocols such as MQTT and STOMP can run on devices in only a few KB of memory. Gateways to AMQP for these are already available.
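
The decoupling that a message bus provides can be illustrated with a toy, in-process stand-in. This sketch is not AMQP, MQTT or any real broker API: the `MessageBus` class, its method names and the topic string are hypothetical, and a real bus would deliver over the network with persistence and acknowledgements.

```python
from collections import defaultdict, deque
from typing import Callable, DefaultDict, Deque, List, Tuple

Handler = Callable[[str], None]


class MessageBus:
    """Toy publish/subscribe bus: publishers and consumers never call each other."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self._queue: Deque[Tuple[str, str]] = deque()

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> None:
        # Returns immediately; the message waits in the queue until delivery.
        self._queue.append((topic, message))

    def deliver_pending(self) -> int:
        # Drain the queue and hand each message to every subscriber of its topic.
        delivered = 0
        while self._queue:
            topic, message = self._queue.popleft()
            for handler in self._subscribers.get(topic, []):
                handler(message)
                delivered += 1
        return delivered


bus = MessageBus()
received: List[str] = []
bus.subscribe("devices/plug-42/state", received.append)   # hypothetical topic name
bus.publish("devices/plug-42/state", "power=on")
bus.deliver_pending()
```

Because the publisher only knows the topic name, either side can be replaced or rewritten in another language without the other noticing, which is the property the text relies on.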

What it takes to manage 1B IOTs

Initial estimates are that it will take between 500 and 5000 servers to manage 1B IOTs. This assumes an Agent-based solution managing small groups of IOTs. Because of TCP connection limits per server, the number of IOTs is limited to 1-10 million per server. If we assume that each Agent has a memory footprint of 20MB when active and 24KB when idle, and a normal mix of 90% active, then the memory requirement per server for Agents alone is around 15MB. Control processes and storage increase this to 32GB.
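
The server-count part of this estimate can be worked through as simple arithmetic. The sketch below uses the 1B device count and the 1-10 million connections per server given above. The `HEADROOM` factor is purely an assumption of this sketch: the text does not say how the 500 to 5000 range was derived, and a factor of 5 is chosen only because it happens to map the connection-limit floor onto that range.

```python
import math


def servers_needed(devices: int, connections_per_server: int, headroom: float = 1.0) -> int:
    """Servers required to hold one connection per device, scaled by a headroom factor."""
    return math.ceil(devices / connections_per_server * headroom)


DEVICES = 1_000_000_000  # 1B IOTs, from the text

# Floor set by the TCP connection limit alone (1 to 10 million IOTs per server).
floor_best = servers_needed(DEVICES, 10_000_000)   # 100
floor_worst = servers_needed(DEVICES, 1_000_000)   # 1000

# HEADROOM is hypothetical: extra capacity for control processes, storage and
# redundancy. It is not a figure from the text.
HEADROOM = 5.0
with_headroom_best = servers_needed(DEVICES, 10_000_000, HEADROOM)   # 500
with_headroom_worst = servers_needed(DEVICES, 1_000_000, HEADROOM)   # 5000

print(floor_best, floor_worst, with_headroom_best, with_headroom_worst)
```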

These metrics are broadly similar to those we see on the Internet today. For example, WhatsApp provides a multimedia messaging service for around 1B users, handling over 7 million messages per second, and currently uses over 1000 servers.

Advantage of Agent based models

IOTs are deployed on highly distributed and inherently unreliable networks. Agent-based solutions make the last known state and the ability to accept commands constantly available. They centralise access controls and entitlements so that these can be managed according to policies. They also simplify the control of IOTs, since the Agent is always available. This makes it easier to create solutions which each access hundreds of IOTs.

Erlang is an Actor-based programming language whose high levels of concurrency simplify the creation of Agent-based solutions. Erlang was created by Ericsson in Sweden and is used today to manage a large proportion of the world's mobile phone networks. To accomplish this, Ericsson created the Open Telecom Platform (OTP), which adds the run-time services needed to operate in a highly reliable 24x7 environment. Erlang is increasingly used to create scalable Cloud applications. It has proved itself to be a highly productive environment where an application can be written to run on a single server and then, with minor modifications, scaled out to dozens of servers.

Environmentally friendly scalability

Running 1000 servers has an environmental cost. We anticipate that the population of IOTs will be very dynamic, making it difficult to predict in advance how many servers are required to support them. The cloud platform must dynamically rightsize, ensuring that just enough servers are powered up and working to support the load, and that servers are spun down or powered off when no longer required. Existing IaaS capabilities such as OpenStack and AWS make it easy to spin up servers or instances on demand, saving costs and environmental impact.

A cluster of servers is partitioned into different application groups and a free pool of unallocated servers. The applications in each partition are monitored. When the load increases, new servers are allocated from the free pool; when the load drops, they are returned. The number of servers in the free pool is further managed so that it is kept small: when the pool becomes too large, servers are powered off or, in the case of an IaaS service, the instance is shut down. This approach ensures that sufficient compute resources are available whilst avoiding the cost and environmental impact of running too many.
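
The free-pool scheme above can be sketched as a single rebalancing step. This is a simplified illustration only: the `Cluster` class, the load thresholds, the pool cap and the server names are all hypothetical, and a real controller would call an IaaS API rather than move names between lists.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cluster:
    """Application partitions plus a free pool of powered-on, unallocated servers."""

    partitions: Dict[str, List[str]] = field(default_factory=dict)
    free_pool: List[str] = field(default_factory=list)
    powered_off: List[str] = field(default_factory=list)
    max_free: int = 2        # hypothetical cap on idle servers in the free pool
    high_load: float = 0.8   # hypothetical scale-out threshold
    low_load: float = 0.3    # hypothetical scale-in threshold

    def rebalance(self, loads: Dict[str, float]) -> None:
        for name, load in loads.items():
            servers = self.partitions[name]
            if load > self.high_load:
                if not self.free_pool and self.powered_off:
                    # Pool is empty: power a server back on (or start an instance).
                    self.free_pool.append(self.powered_off.pop())
                if self.free_pool:
                    servers.append(self.free_pool.pop())
            elif load < self.low_load and len(servers) > 1:
                # Load dropped: return one server, always keeping at least one.
                self.free_pool.append(servers.pop())
        # Keep the free pool small: power off anything above the cap.
        while len(self.free_pool) > self.max_free:
            self.powered_off.append(self.free_pool.pop())


cluster = Cluster(
    partitions={"web": ["s1"], "data": ["s2", "s3", "s4", "s5"]},
    free_pool=["s6", "s7"],
)
cluster.rebalance({"web": 0.9, "data": 0.1})   # web grows, data shrinks
```

Moving only one server per partition per step keeps the sketch short; it also damps oscillation, although a real policy would need to be tuned against measured load.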

Different partitions are created from servers with different characteristics: large disks and lots of memory for data storage, high core counts and reasonable memory for the mid tier, and lots of small micro servers for the web access tier.

Erlang has a portable run time which is available on many processor architectures and most operating systems. It uses a virtual machine for portability, and its soft real-time behaviour makes it very responsive and predictable. It uses an Actor model and lightweight processes to facilitate Agent-based solutions. It was designed for concurrency, with immutable variables and asynchronous messaging built in. This simplifies the creation of distributed programs.

Erlang is supplemented by the Open Telecom Platform (OTP), which was created by Ericsson as the basis of all its mobile phone network management. This simplifies the creation of highly reliable, scalable services running in 24x7 environments. OTP provides many of the required components: generalised supervisor patterns and standard behaviours for reliable services, finite state machines and an event manager; a distributed database, stored on disk and/or in memory, with support for transactions; and release control, live updates, profiling, code inspectors etc. It has proven reliability over many years running mobile carrier networks.

Whilst we believe Erlang OTP is the best fit for the core services, the platform needs to easily integrate components developed in any language. A message bus based on AMQP enables this. AMQP is an ISO/OASIS wire-level protocol with client library support in almost all programming languages and with off-the-shelf components for node.js and other popular web applications. The message bus enables polyglot applications, in which components written in any language co-exist. We have already created Cloud apps based on this design using RabbitMQ. This includes a powerful gateway model which supports AMQP to MQTT.

We have also used the strong concurrency and communications capabilities of Erlang to create control planes which run across a distributed environment and orchestrate existing apps, allowing components written in other languages to be incorporated. Erlang has rich integration capabilities enabling it to interface directly to Java and C programs. This simplifies those programs by leaving the concurrency and multi-threading to Erlang, where it is easier and safer to program.