EMC Big Data

Heights provides the vision, talent and technology so you can act now and get results from Big Data to better understand customer behavior, optimize operations, manage risk and innovate.

Big data is not a precise term; rather, it’s a characterization of the never-ending accumulation of all kinds of data, most of it unstructured. It describes data sets that are growing exponentially and that are too large, too raw or too unstructured for analysis using relational database techniques. Whether the volume is terabytes or petabytes, the precise amount matters less than where the data ends up and how it is used.

When used correctly, big data can yield insights to develop, refine or redirect business initiatives; discover operational roadblocks; streamline supply chains; better understand customers; as well as develop new products, services and business models.

While the usefulness of big data may be clear, the path toward big data productivity is not. Successfully leveraging big data insight requires a real investment in proven technologies, updated workforce skills and leadership focus. Organizations must combine three facets of strategy (technical, organizational and cultural) in order to implement a big data platform that suits the business and its objectives.

Heights Big Data Road Map

Strategies for dealing with big data challenges will differ depending on the “data maturity” of the organization. How efficiently and effectively can data be collected for analysis? For that matter, is the organization aware of all of the different types and sources of data that should be included to maximize insight and answers? How well can the organization reconcile different data formats? What is the cost of collecting and analyzing data, and how is that cost weighed against the anticipated value of the outcome?

1. Dimensionalize your data mix

As businesses progress along this data learning curve, they can begin exploring new uses and combinations of data. This means collecting new types of data, adding new sources of data to existing sets and combining sets to create new value and insights.

Business workers should be encouraged to use their imagination to test hypotheses and validate or disprove hunches with the help of big data. IT should also get creative in pioneering new ways to collect, partition and combine data so that insight is unveiled and action can be taken.

2. Prepare to act on what’s learned

All information-driven insight, regardless of whether it’s from a team of management consultants or from big data analytics, is only as valuable as what is done with it. Big data offers organizations the opportunity to derive detailed, timely insights and act on them with greater speed and agility than ever before. For instance, analyzing social media data could uncover customer behaviors that can be used to customize the promotions and offers presented to certain customers. Achieving this type of real-time responsiveness to opportunity will require organizations to become far nimbler about how they manage business processes and workflows. Business leaders must set expectations of action and charge managers with injecting data-driven discoveries into how their teams work. Council members concede that creating the organizational flexibility to adapt may be the toughest challenge of all.

3. Pinpoint data value

Bring together line-of-business leaders and IT practitioners to identify which data pools have the greatest value. Evaluate the data stores that have been prepared for analysis and consider how they could be expanded or improved. Then look at unstructured data sets and prioritize which ones should be converted to more usable formats. Business leaders and IT must also work together to highlight use cases for the data based on existing business to determine which approaches will yield the most business value in the shortest window.

4. Control data hygiene

Starting with the information already stored in corporate IT architectures, organizations should clean up existing data stores to prepare them for this new form of analysis. Basic blocking-and-tackling “data hygiene,” such as compressing, de-duping and archiving old files, will streamline storage, enable outdated systems to be retired and make it easier to identify data stores that need to be brought up to date. Integrating data wherever possible, implementing a data tagging system and training IT staff in partitioning data are also important parts of the preparation process.
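As one illustration of the de-duping step, the sketch below groups files whose contents are byte-for-byte identical by comparing SHA-256 digests. It is a minimal example rather than a prescribed method: the function names file_digest and find_duplicates are hypothetical, and it assumes the data store is a directory tree on a local file system.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root: Path) -> List[List[Path]]:
    """Group files under root that have identical contents (hypothetical helper)."""
    by_digest: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Only groups with more than one file are duplicates.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Each returned group is a set of candidates for archiving or removal; deciding which copy to keep is a policy question the sketch deliberately leaves open.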

Infrastructures for Big Data

Massive quantities of data are difficult to move swiftly back and forth over today’s network connections. Big data infrastructures must distribute compute power so that data analysis can be done close to the user to avoid the latency inherent in crossing networks. Distributing this kind of compute power to fuel analysis tools and provide real-time responses will pose challenges as organizations realize that analysis may well need to occur at the place where the data resides. Council members see a trend in which the velocity and volume of data make it impractical to move data back and forth for processing. Instead, computation and analytics will likely move to the data. Furthermore, council members see cloud computing models becoming essential to the success of big data projects.
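The idea of moving computation to the data can be sketched in a few lines. In the hypothetical example below, each site reduces its own raw records to a small (sum, count) summary, and only those summaries cross the network to be combined into a global average. The function names and the site data are assumptions made for illustration.

```python
def local_summary(values):
    """Runs where the data lives: reduce raw records to a (sum, count) pair."""
    return (sum(values), len(values))


def combine(summaries):
    """Runs centrally: merge the small per-site summaries into a global mean."""
    total = sum(s for s, _ in summaries)
    count = sum(n for _, n in summaries)
    return total / count if count else None


# Hypothetical per-site data; in practice these lists would never leave the site.
sites = {"site_a": [10, 20, 30], "site_b": [40]}
summaries = [local_summary(values) for values in sites.values()]
global_mean = combine(summaries)
```

However many records a site holds, it ships only two numbers, which is why this pattern suits data sets too large to move.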

But storing and serving up the data is not enough; it must be synthesized, analyzed and correlated in new ways in order to deliver business value. Some big data techniques require working with data that hasn’t been modeled by data architects, allowing for the comparison of different types of data and pattern matching across disparate data sources. This allows big data analytics to provide new perspectives on traditional corporate data and yield insights into data that traditionally has not been analyzed. Tools such as Hadoop, an open-source technology that distributes data-analysis workloads across many computers so that the analysis runs as many parallel workloads and produces results faster, are essential. Commercial tools are still nascent, as big data is still a fairly new phenomenon. As a result, most of the software programs used today for big data analytics are purpose-built and developed in-house using open source tools created by the Apache Software Foundation, Google, Yahoo and others.
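The split-then-merge pattern behind this kind of parallel analysis can be shown with a small word count. This is a sketch of the general map-and-reduce idea, not Hadoop’s API: a local thread pool stands in for the many computers, and the function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_count(chunk):
    """Map step: count words in one chunk, independently of the other chunks."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right


def word_count(lines, workers=4):
    """Split the input into chunks, count each in parallel, then merge."""
    chunks = [lines[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_count, chunks))
    return reduce(merge_counts, partials, Counter())
```

Because each chunk is counted without reference to any other, the map step can be spread across as many workers as are available, and only the compact partial counts need to be brought together at the end.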

With entirely new rates of data growth, everything must be re-evaluated. Enterprises are making significant investments in new infrastructure to capture, store, aggregate, manage, govern and analyze data, an undertaking that must be approached holistically and with big data analytics in mind. To accommodate the data itself, IT infrastructures must be able to inexpensively store higher volumes and more types of data than ever before. Data velocity, meaning the speed at which data changes, must also be accommodated.