Anjul Bhambhri, VP-Big Data, IBM
Big Data is no longer “hype,” but a reality. Customers are progressing from experimentation in small R&D groups to implementing use cases and taking applications into production. Companies are identifying the gaps in the industry’s Big Data and analytics offerings, which are quickly being filled by the open source movement and enterprise vendors. For example, look at the quick sequence of capabilities that have been released around SQL, metadata management, encryption, interactive query processing, and analytics.
Also, Hadoop is evolving into a heterogeneous computing and storage platform. The platform will continue to evolve and will become the next-generation data warehouse and visualization base, bridging the gap between transaction processing and analytics needs. Industry demand will shift from a data landing zone that serves downstream warehouses and data marts to a true polystructured warehouse with large-scale, in-place analytics capabilities built into the warehouse substrate. This will drive the need for a variety of storage engines on that warehouse base (key-value, columnar, hierarchical, graph, document, JSON, and dimensional), accessed through a common set of interfaces that includes SQL, REST APIs, document APIs, link traversal, and search.
Over the next few years, I see a few key trends in the Big Data space:
• The Hadoop platform will mature into a heterogeneous computing platform for batch, interactive, analytical and real-time workloads.
• The maturation of models like Docker and OpenStack will make it easier to set up and manage large clusters.
• More Big Data implementations will move from being on premise to the cloud.
• Companies will have the ability to write and run analytics algorithms across different architectures, exploiting MapReduce, Spark and even SQL.
• Finally, the rise of hybrid Big Data platforms will force a cultural shift within the enterprise. Organizations will become much more data driven, applying insights to everything from key business processes and decisions to the way they fundamentally operate. Rather than relying on decisions based on gut feelings, businesses will infuse analytics into everything that employees touch (management systems, machine-to-machine processes, daily decisions and tasks, etc.) to develop data-driven and evidence-based cultures and workforces.
The following factors are critical to the success of Big Data projects:
• The availability of skills to design and implement large-scale projects that cut across data management and analytics.
• Breakthroughs in visualization for the discovery, correlation, and exploration of complex data.
• Built-in governance capabilities in the Big Data infrastructure, such as encryption, masking, lineage, life cycle management, and auditing.
Criteria to evaluate Big Data outsourcing vendors
Whether you are evaluating Big Data development in-house or via an outside vendor, the most important factor to consider is longevity. Your strategy should not be based on short-term goals, but should instead focus on long-term business growth. Your company needs a technology infrastructure that will support the Big Data and analytics capabilities of the future and, in turn, your business’s growth and success.
Taking a “platform” approach is key, as it allows users to address the full spectrum of Big Data challenges. As Big Data quickly emerges as the world’s newest resource for competitive advantage, organizations are building platform-based analytics approaches.
For a successful deployment, your platform needs a combination of cluster management, domain-specific application development expertise, and a data science team. Some enterprises with strong in-house development skills can corral this. Most others have skills that are more domain-specific (like retail, pharmaceuticals, clinical trials, fraud) and either cannot handle the scale and complexity of the deployment or lack data science teams. Therefore, the key factors to evaluate when deciding whether to build in-house or to partner are:
• Are you skilled in handling a complex data center operation? If not, you should look at a managed services provider either on a cloud infrastructure or on premise.
• Do you have experience handling a warehouse infrastructure with associated data management, reporting, governance and analytics capabilities? If not, you should get a solution integrator or a service bureau that has the experience in designing and implementing a Big Data platform with appropriate tools and governance.
• Ensure you have access to the right set of data and information discovery tools, as well as visualization tools that can run on the Big Data platform you are choosing.
• Do you have quants and data scientists in house? If not, gather a small team of data scientists that understand your domain. This can be done in-house or even as a service via specialist providers.
• Establish a center of competency around Big Data that can deploy the right platform with a focus on gaining actionable insights.
The biggest challenges towards successful deployment of Big Data projects
The key to Big Data success is thinking long-term. Companies should build a strategy that focuses not only on how to start using Big Data, but also on how to grow with Big Data. While you must consider your long-term strategy, it is best to adopt Big Data and analytics in small chunks. While Big Data projects can transform the enterprise, the implementation must proceed in increments.
The first step for companies that want to leverage Big Data is to identify team members with the skills to dive into the data, whether by searching internally or by bringing in new, data-skilled workers. Establishing a small, data-focused team is a good way to build skills, best practices around platform and analytics, and application development tools that are relevant to the industry your company operates in.
Verticals that will benefit the most from Big Data and Analytics
Big Data is being adopted across industries. Big Data and analytics technologies are used in industries as varied as telco, retail, healthcare, e-commerce, hospitality, the auto industry, energy and utilities and more.
For instance, Emory University Hospital is using streaming analytics technology as part of a research project aimed at developing advanced, predictive care for critical patients. The system can identify patterns in physiological data and instantly alert clinicians to danger signs in patients within the ICU.
In the auto industry, PSA Peugeot Citroën is using Big Data and analytics along with mobile solutions to integrate and analyze the massive amounts of data from cars, phones, traffic signals, lights and other sources to launch a new era of connected vehicles. The French car manufacturer plans to offer a range of “connected services” to its clients, allowing them to use numerous access channels such as websites, vehicle data, customer service or mobile applications. Drivers will also be able to access other vehicle data and customer service information, such as more precise weather information derived from the onboard temperature, light, and windshield wiper sensors in Peugeot Citroën's cars.