Reality is beginning to bite the Internet of Things (IoT). After months of enthusiastic discussion about the opportunities it will provide and how much it will be worth, many of those looking to play in the IoT space are starting to look at the potential problems, including data management.

Though everyone knows managing data will be a problem once the IoT is up and running at full scale, few have really considered the potential data storage problems.

The Problem With Data

Sure, there have been hypothetical discussions around compliance, privacy or the kind of information that consumers will be happy to offer to businesses in exchange for better customer experiences. But there has been little discussion of where exactly enterprises plan to store the massive amounts of data that will be created.

Think about it. According to research from Gartner, there will be an estimated 26 billion units installed globally by 2020 with many more on the way in succeeding years as the price of processors drops. In the near future, it will be feasible to install a processor into just about everything.

Where is all the data generated by those processors going to be stored, and what problems come with storing it?

This is not just a brainteaser. It is a very practical and real problem. After all, if enterprises are to get the bountiful insights into customer activity that the IoT promises, they are also going to have to keep all that information somewhere while it is being analyzed.

In a recently published paper entitled The Impact of the Internet of Things on Data Centers, Gartner identifies the principal issues that will have to be resolved before enterprises can start to benefit from the IoT. Fabrizio Biscotti, research director at Gartner, summarized the problem as follows:

“IoT deployments will generate large quantities of data that need to be processed and analyzed in real time. Processing large quantities of IoT data in real time will increase as a proportion of workloads of data centers, leaving providers facing new security, capacity and analytics challenges.”

Connecting Remote Assets

The problem lies in the nature of the IoT itself. It will connect remote devices and systems and provide a data stream between devices and decentralized management systems. The data or even the devices will be incorporated into existing organizational processes to provide information on the location, status, activity and functionality of those systems, as well as information about the people who own and operate them.

The amount and type of information differ from other sets of big data, such as the data that comes from social media, in the following ways:

  • It tends to arrive as a steady stream and at a steady pace, although it can arrive in batches like test logs that can be processed and passed on straight away
  • It comes in very large quantities and accumulates very fast
  • The real value can only be uncovered using analytics
  • It is rarely used for production purposes
  • It is deleted very quickly, unless it is needed for compliance reasons
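
To make the last point in the list concrete, here is a minimal Python sketch of a retention rule that discards readings once they pass a time limit unless they are flagged for compliance. The Reading type, the compliance_hold flag and the 100-second limit are all hypothetical names and values chosen for illustration; they do not come from Gartner's paper or from any particular product.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reading:
    device_id: str
    timestamp: float               # seconds since an arbitrary epoch
    value: float
    compliance_hold: bool = False  # hypothetical flag: must be retained


def purge_expired(readings: List[Reading], now: float, ttl_seconds: float) -> List[Reading]:
    """Keep readings that are still within the time limit or under a compliance hold."""
    return [
        r for r in readings
        if r.compliance_hold or (now - r.timestamp) <= ttl_seconds
    ]


# Illustrative data only.
readings = [
    Reading("meter-1", timestamp=0.0, value=1.2),
    Reading("meter-1", timestamp=50.0, value=1.3, compliance_hold=True),
    Reading("meter-2", timestamp=950.0, value=7.8),
]
kept = purge_expired(readings, now=1000.0, ttl_seconds=100.0)
print([(r.device_id, r.timestamp) for r in kept])
```

In this example the oldest reading is dropped, the held reading survives despite its age, and the recent reading is kept because it is still inside the time limit.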

The IoT Data Challenge

Technologies that address the big data challenge, such as Hadoop and NoSQL databases, already exist and provide horizontal scalability, high capacity and parallel processing at affordable prices.

So far, though, IT departments in enterprises have not had to deal with IoT data as a unique dataset in its own right. For the moment at least, the first sets of what will make up IoT data are arriving in the storage layer in the same way other unstructured data does.

The result is that traditional storage architecture and management software can treat IoT data the same way as they treat other unstructured data.

However, this is all changing rapidly. With the development of wearables for consumers and the emerging use of smart machines, the portion of IoT as a subset of big data will grow quickly, forcing enterprises to rethink their infrastructure to make it scalable and cost-effective.

With these changes come seven different challenges that enterprises, and in particular IT departments, will have to manage.

"The enormous number of devices, coupled with the sheer volume, velocity and structure of IoT data, creates challenges, particularly in the areas of security, data, storage management, servers and the data center network, as real-time business processes are at stake," Joe Skorupa, Gartner vice president, said.

Gartner has identified several challenges:

1. Security

While the digitalization and automation of millions of devices will create a whole new security landscape as enterprises attempt to protect themselves, it will also create new opportunities for operational technology security providers.

Already, many industry-specific security platforms are being developed for specialist areas like industrialized systems, medical equipment, and air and defense sectors and, in many cases, are being integrated into the platforms being developed by equipment providers for those industries. Such solutions are aimed at securing various aspects of specific devices, such as smart meters, or at tackling platform-specific vulnerabilities.

2. Enterprises

There will also be significant security challenges from the increasing amount of data, with the myriad of devices adding to security complexity. This, in turn, will have an impact on availability requirements, which are also expected to increase, putting real-time business processes at risk.

3. Consumer Privacy

Related to this is the challenge of securing the personal data of individuals as the consumer goods they use become increasingly digitized. Already there are issues around metering equipment and digitalized automobiles.

This is particularly challenging because the information generated by the IoT is key to delivering better services and to managing such devices. Security will have to be integrated as part of IoT infrastructure.

4. Data

The impact of the IoT on storage is two-pronged in types of data to be stored: personal data (consumer-driven) and big data (enterprise-driven). Already in use in key verticals such as healthcare and financial services, big data is transforming how and why companies collect and store data.

IT administrators who are already tasked with keeping the storage centers running will also have to figure out how to store, protect and make all the incoming data accessible. If, as Gartner estimated, storage servers are only being used to between 30 and 50 percent of capacity, the physical capabilities are there. Managing them, however, is an entirely different problem.

5. Storage Management

However, even if the capacity is available now, there will be further demands made on storage, and these will have to be addressed as the need to access this information becomes more important. Businesses will have to weigh up the economics of storage against the value of IoT information.

6. Server Technologies

The impact of IoT on the server market will be largely focused on increased investment in key vertical industries and organizations related to those industries where IoT can be profitable, or add significant value.

Some organizations that manage and consume data collected from a huge array of devices will require additional compute capacity and may well increase server budgets if there is a business case for it.

7. Data Center Network

Existing data center WAN (Wide Area Network) links have been built for moderate-bandwidth requirements created by our current use of technology. However, as the amount of data being transferred is set to increase dramatically, the need for expanded bandwidth grows.

The result of all this, the research points out, is that because of the scale of the data being created it will no longer be economically feasible to store data at a single location.

The Implications For Enterprises

This flies in the face of a trend in recent years to centralize applications in a single center to reduce costs and enhance security. The result is that enterprises will be forced to aggregate data in multiple distributed data centers where processing of that data can take place.

This implies a re-architecting of the systems that manage data, as well as a more comprehensive strategy around the way we store data and the kind of data we store.
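
One way to picture processing data where it is aggregated is for each site to reduce its raw readings to a small summary and send only that summary onward. The Python sketch below is an assumption-laden illustration, not something described in the research: the Summary type, the summarize and merge helpers and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Summary:
    """Compact statistics a site could forward instead of raw readings."""
    count: int
    total: float
    minimum: float
    maximum: float

    @property
    def mean(self) -> float:
        return self.total / self.count


def summarize(values: Iterable[float]) -> Summary:
    """Reduce one site's raw readings to a summary (requires at least one value)."""
    vals = list(values)
    if not vals:
        raise ValueError("cannot summarize an empty set of readings")
    return Summary(len(vals), sum(vals), min(vals), max(vals))


def merge(a: Summary, b: Summary) -> Summary:
    """Combine two site summaries without access to the raw readings."""
    return Summary(
        a.count + b.count,
        a.total + b.total,
        min(a.minimum, b.minimum),
        max(a.maximum, b.maximum),
    )


# Illustrative values only: two sites, each summarizing locally.
site_a = summarize([20.0, 22.0, 24.0])
site_b = summarize([30.0])
combined = merge(site_a, site_b)
print(combined.count, combined.mean, combined.minimum, combined.maximum)
```

The point of the design is that only four numbers per site cross the network, whatever the number of raw readings behind them. The trade-off is that any analysis needing the individual readings has to run at the site that holds them.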

On top of this, the volume of data will create potentially insoluble governance issues around network bandwidth and remote storage bandwidth, and the capacity to back up all raw data is likely to be unaffordable.

Consequently, organizations will have to automate selective backup of the data that they believe will be valuable or required. This sifting and sorting will generate additional big data processing loads that will consume additional processing, storage and network resources that will have to be managed.
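
As a rough illustration of what automated selective backup could look like, the Python sketch below applies a set of rules to each record and backs up only the records that match at least one of them. The rules (is_alarm, is_regulated), the record fields and the sizes are hypothetical and exist only to show the shape of the idea; real selection criteria would depend on each organization's own view of what is valuable or required.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Record = Dict[str, Any]
Rule = Callable[[Record], bool]


def select_for_backup(records: Iterable[Record], rules: List[Rule]) -> Tuple[List[Record], List[Record]]:
    """Split records into (selected, skipped); a record is selected if any rule matches."""
    selected: List[Record] = []
    skipped: List[Record] = []
    for rec in records:
        if any(rule(rec) for rule in rules):
            selected.append(rec)
        else:
            skipped.append(rec)
    return selected, skipped


# Hypothetical rules, for illustration only.
def is_alarm(rec: Record) -> bool:
    return rec.get("status") == "alarm"


def is_regulated(rec: Record) -> bool:
    return rec.get("regulated") is True


records = [
    {"id": 1, "status": "ok", "regulated": False, "size_bytes": 200},
    {"id": 2, "status": "alarm", "regulated": False, "size_bytes": 300},
    {"id": 3, "status": "ok", "regulated": True, "size_bytes": 500},
    {"id": 4, "status": "ok", "regulated": False, "size_bytes": 1000},
]
selected, skipped = select_for_backup(records, [is_alarm, is_regulated])
backup_bytes = sum(r["size_bytes"] for r in selected)
total_bytes = sum(r["size_bytes"] for r in records)
print([r["id"] for r in selected], backup_bytes, total_bytes)
```

Here two of the four records are selected, so 800 of the 2,000 bytes are backed up. Evaluating the rules is itself extra work for every record, which is the additional processing load the paragraph above describes.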

The overall impact of this is that data center operations will need to start planning their data strategies with the impact of distributed data centers written into the equation.

A number of companies have already started the process, with the likes of OpenText, for example, announcing plans for a center in Australia, while global gorillas like IBM continue to build up the number of data centers they own and control, with regular announcements about data center openings across the planet.

This is only the beginning of a process that will see major changes in the design and architecture of the data infrastructures that are required to make the IoT a reality.

Title image by Stuart Monk (Shutterstock).