Globally, organizations are focused on harnessing the power of Big Data. A first step in the journey is typically to develop an understanding of the difference between Big Data and the more traditional data environment. Gartner, in 2012, described Big Data based on three Vs: Volume, Velocity, and Variety. Volume, the scale of the data captured, is a key component in describing a Big Data platform. According to a recent IBM study, every day the world now creates over 2.5 quintillion bytes (roughly 2.5 billion gigabytes) of data, providing a potentially massive level of data for Big Data platforms. Velocity, the frequency of the data captured, continues to increase, driven by the rapid expansion of mobile phones, sensors, and other connected devices, which produce an endless stream of data. Variety, the range of data types captured, reflects how Big Data has moved us from a world of structured transaction data into one that captures everything from social media content to health data from wearable devices.
Once executives understand the difference between Big Data and the traditional data environment, the next step in the journey frequently becomes the selection of the physical platform. In the mid-1990s, the advent of the data warehouse caused a similar response. As organizations recognized the need to have a distinct platform for informational analysis to prevent impact on the transactional environment, the focus immediately shifted to the technology selection process. At this point in the adoption of Big Data, as was the case with data warehousing, it is important to avoid becoming a Field of Dreams casualty. Those organizations that move too quickly to select and build a Big Data platform may have adopted a strategy from the 1989 movie, “build it and they will come”. Similar to the failed data warehouse projects of the past, building a Big Data platform without first identifying specific business value can cause your organization to become another victim.
The original three Vs are a good way to define the concept of Big Data; however, I encourage you to consider three additional, extremely important Vs when implementing Big Data in your organization: Value, Validity, and Vitality. To prevent the situation where a Big Data platform is built but not used, the business value must be understood in advance, the validity of the solution must be sound for business adoption, and finally, the organization must have the vitality to carry through the implementation process. Let’s examine each of these areas more thoroughly.
Recently, I have listened to a number of presentations on Big Data where the key selling point presented was “Our Big Data solution is great since you can load all of your data on low-cost commodity servers, with disk storage costing nearly nothing, in whatever format the source data exists, and figure out how to use it later.” Don’t let the appeal of this approach cause your organization to become the next Field of Dreams casualty. A successful Big Data journey starts with a clear vision of how it will deliver business value. For example, it is possible today to capture streaming data from thousands of sensors in vehicles, ranging from braking levels to speeds traveled while in turns. Properly used, this data can allow an insurance company to align the price of a policy to the driving behaviors of the customer. The creation of a Big Data platform to ingest, analyze, and produce a custom policy for each customer has the potential to deliver tremendous value when properly supported by the key business areas.
My next suggested focus area is the validity of the solution. Technically, a properly configured Big Data platform can easily handle the streaming sensor data from a nearly unlimited number of vehicles. As a result, the solution can be created and it has the potential to drive significant business value, but is it valid? Before the insurance company in our example heads down the path of implementing this type of solution, several areas must be explored. Is it legal to price an auto policy based on the driving patterns of an individual versus a group within the state where the insurance company operates? Does the business area responsible for pricing have the ability to set and administer prices at this level and pace? With the increasing consumer focus on privacy, will customers agree to this level of monitoring in exchange for more targeted pricing? Validity may become more interesting and important once an initial solution is operational. With driving patterns for a large number of customers captured, other potential business value areas may be identified. For example, would it be valid for the insurance company to sell the location and behavior data insights to other companies for use in location-based marketing? The business must address questions such as these, which go beyond the technical feasibility of the Big Data solution.
Finally, an organization must determine if it has the vitality to implement the solution. The implementation of the first Big Data solution within an organization will be difficult. Success will depend on selecting strong partners, cross-training existing resources, and potentially adding new resources. Currently, the limited availability of skilled Big Data resources presents a challenge.
The level of maturity of the software and applications in the Big Data area is rapidly improving, but it is still much lower than that of standard data and application development solutions. Big Data solutions in some respects are similar to electric cars. The electric car has tremendous energy efficiency, very high torque, and fast acceleration, but is still experiencing pain points such as limited charging stations, high battery costs, and concerns over driving distances. Big Data platforms, such as Hadoop, have shown the ability to support massive data volumes, with modules for both structured and unstructured data that can address a multitude of data types. However, like the electric car, the platform is experiencing pain points such as limited available resources, emerging development tools, and concerns about integration with the existing environment.
As you begin your Big Data journey, leverage Volume, Velocity, and Variety to drive a better understanding of the potential, but do not forget Value, Validity, and Vitality as they relate to successful execution.