Business analytics is one of IBM’s main market segments, and with data volumes growing and the need for extended storage space rising, analytics is essential. As a result, IBM has just unveiled a new storage architecture.
Designed for data-intensive tasks like financial analytics, digital media use and data mining, the new General Parallel File System-Shared Nothing Cluster (GPFS-SNC) will take hours off complex analytics, without heavy infrastructure investments.
IBM and Big Data
IBM has been active in the Big Data field for quite some time, and with analytics in general, and real-time analytics in particular, rapidly becoming a key business differentiator, it has been under pressure to provide better and bigger data analysis.
Leaving aside acquisitions related to extending its business analytics portfolio, IBM has also made other advances this year.
In May, in the wake of the closure of the NISC and Initiate deals, it announced new services and products called IBM InfoSphere BigInsights to help companies analyze Big Data.
IBM InfoSphere BigInsights
IBM InfoSphere BigInsights moved analytics up a notch by enabling enterprises to analyze petabytes of data using Apache Hadoop, an open source technology developed specifically for the Big Data end of the datasphere.
Simply put, Big Data are datasets that grow so large they become awkward to work with using conventional database management tools, creating difficulties in capture, storage, search, sharing, visualization and analysis.
While IBM has a number of very large facilities all over the world to manage Big Data, it makes better financial sense to rearrange the architecture to keep up with growing data requirements and to improve workload flexibility by rapidly provisioning system resources for different types of workloads.
Big Data Architecture
The GPFS-SNC architecture does this by providing advanced clustering technologies, dynamic file system management and advanced data replication techniques.
It is a ‘shared nothing’ distributed architecture in which each node is self-sufficient. Tasks are divided up between independent but linked computers, so an individual task can be completed without having to wait for the rest of the cluster to finish its work.
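To make the shared-nothing idea concrete, here is a minimal Python sketch. It is purely illustrative and is not IBM's GPFS-SNC implementation: the Node class, the round-robin partition function and the sample records are all hypothetical names chosen for this example. It only shows the general pattern of each node owning its own slice of data and producing a result without touching any other node's storage.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node that owns its data outright (no shared disk)."""
    name: str
    local_records: list = field(default_factory=list)

    def run_task(self):
        # Works only on locally stored records; never reads another node's data.
        return sum(self.local_records)


def partition(records, nodes):
    """Spread records across nodes round-robin so each node holds its own slice."""
    for i, record in enumerate(records):
        nodes[i % len(nodes)].local_records.append(record)


# Illustrative setup: three nodes and ten sample records (assumed values).
nodes = [Node("node-a"), Node("node-b"), Node("node-c")]
partition(list(range(1, 11)), nodes)

# Each partial result depends only on that node's own data,
# so no node has to wait on another before finishing its task.
partials = {node.name: node.run_task() for node in nodes}

# A final combine step merges the independent partial results.
total = sum(partials.values())
```

In a real cluster the nodes would run on separate machines and in parallel. The sketch runs them one after another in a single process simply to keep the example self-contained.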
IBM's current GPFS technology offering is the core technology for IBM's High Performance Computing Systems, IBM's Information Archive, IBM Scale-Out NAS (SONAS) and the IBM Smart Business Compute Cloud.
While the details of the new architecture have been released, it is not clear when it will be implemented in Big Data projects, although the need is such that we will probably be seeing it soon.