Big data is a relative term describing a situation where the volume, velocity and variety of data exceed an organization’s storage or compute capacity for accurate and timely decision making.
Enhancing BI with big data
Big data techniques complement business intelligence (BI) tools to unlock value from enterprise information. Whereas BI traditionally performs structured analysis and provides a rear-view mirror into business performance, big data analytics provides a forward-looking view, enabling organizations to anticipate and act on future opportunities. Simple reporting, spreadsheets and even fairly sophisticated drill-down analyses have become commonplace expectations of BI. However, there are types of analysis that BI can’t handle, particularly when data sets become more diverse, more granular, real-time and iterative, so that organizations must capture in-depth information from a specific moment in time before conditions change. This kind of unstructured, high-volume, fast-changing data breaks the relational database model and requires a new class of technologies and analytic methods to extract value. For example, big data approaches are essential when organizations want to engage in predictive analysis, natural language processing, image analysis or advanced statistical techniques such as discrete choice modeling and mathematical optimization, or when they want to mash up unstructured content and analyze it alongside their BI mix.
Big data in the cloud
Cloud models tame big data while extracting business value from it. This delivery model gives organizations a flexible option for achieving the efficiencies, scalability, data portability and affordability they need for big data analytics. Cloud models encourage access to data and provide an elastic pool of resources to handle massive scale, solving the problem of how to store huge data volumes and how to amass the computing resources required to manipulate them. In the cloud, data is provisioned and spread across multiple sites, allowing it to sit closer to the users who require it, speeding response times and boosting productivity. And because the cloud makes IT resources more efficient and IT teams more productive, enterprise resources are freed to be allocated elsewhere.
Cloud services specifically designed for big data analysis are starting to emerge, providing platforms and tools built to perform analytics quickly and efficiently. Companies that recognize the importance of big data but don’t have the resources to build the required infrastructure or acquire the necessary tools to exploit it will benefit from considering these cloud services.
Information Management for Big Data
Many organizations already struggle to manage their existing data. Big data will only add complexity to the issue. What data should be stored and how long should we keep it? What data should be included in analytical processing and how do we properly prepare it for analysis? What is the proper mix of traditional and emerging technologies?
Big data will also intensify the need for data quality and governance, for embedding analytics into operational systems and for issues of security, privacy and regulatory compliance. Everything that was problematic before will just grow larger.
Heights provides the management and governance capabilities that enable organizations to effectively manage the entire life cycle of big data analytics, from data to decision.
Heights provides a variety of these solutions, including data governance, metadata management, analytical model management, run-time management and deployment management.
With Heights, this governance is an ongoing process, not just a one-time project. Proven methodology-driven approaches help organizations build processes based on their specific data maturity model.
Heights Information Management technology and implementation services enable organizations to fully exploit and govern their information assets to achieve competitive differentiation and sustained business success.
Three key components work together in this realm:
• Unified data management capabilities, including data governance, data integration, data quality and metadata management.
• Complete analytics management, including model management, model deployment, monitoring and governance of the analytics information asset.
• Effective decision management capabilities to easily embed information and analytical results directly into business processes while managing the necessary business rules, workflow and event logic.
High-performance, scalable solutions slash the time and effort required to filter, aggregate and structure big data. By combining data integration, data quality and master data management in a unified development and delivery environment, organizations can maximize each stage of the data management process.
Big Data Technologies
Accelerated processing of huge data sets is made possible by four primary technologies:
Grid computing: A centrally managed grid infrastructure provides dynamic workload balancing, high availability and parallel processing for data management, analytics and reporting. Multiple applications and users can share a grid environment for efficient use of hardware capacity and faster performance, while IT can incrementally add resources as needed.
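The parallel-processing idea behind a grid can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a local thread pool stands in for the grid nodes, and the names `partition`, `partial_sum` and `grid_mean` are hypothetical, not part of any Heights product. The work is split into chunks, each chunk is processed independently, and the partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_parts):
    """Split records into n_parts roughly equal chunks."""
    return [records[i::n_parts] for i in range(n_parts)]


def partial_sum(chunk):
    """Work done independently on one (hypothetical) grid node."""
    return sum(chunk), len(chunk)


def grid_mean(records, n_workers=4):
    """Compute a mean by combining per-chunk partial results.

    Assumes records is non-empty. A local thread pool is used here
    purely as a stand-in for separate machines in a grid.
    """
    chunks = partition(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, chunks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


sales = list(range(1, 101))   # made-up sample data
mean_sale = grid_mean(sales)
```

Adding capacity in this sketch amounts to raising `n_workers`; a real grid would also handle scheduling, failover and data placement, none of which is modeled here.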
In-database processing: Moving relevant data management, analytics and reporting tasks to where the data resides improves speed to insight, reduces data movement and promotes better data governance. Using the scalable architecture offered by third-party databases, in-database processing reduces the time needed to prepare data and build, deploy and update analytical models.
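The reduction in data movement can be seen in a small sketch, assuming an SQLite database from Python's standard library as a stand-in for a third-party database. The `orders` table, its contents and both function names are made up for illustration. The first function pulls every row into the application before aggregating; the second pushes the aggregation into the database so that only one row per region is returned.

```python
import sqlite3

# Hypothetical sample data in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 120.0), ("east", 80.0),
     ("west", 200.0), ("west", 50.0), ("west", 50.0)],
)


def totals_client_side(conn):
    """Fetch every row, then aggregate in application code."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM orders"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_in_database(conn):
    """Let the database aggregate; only one row per region is returned."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region"
    )
    return dict(rows)
```

Both functions produce the same totals, but the second transfers two rows instead of five. On tables with millions of rows that difference is where the savings in data movement come from.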
In-memory analytics: Quickly solve complex problems using big data and sophisticated analytics in an unfettered manner. Use concurrent, in-memory, multi-use access to data and rapidly run new scenarios or complex analytical computations. Instantly explore and visualize data. Quickly create and deploy analytical models. Solve dedicated, industry-specific business challenges by processing detailed data in memory within a distributed environment, rather than on disk.
Support for Hadoop: You can bring the power of Heights Analytics to the Hadoop framework (which stores and processes large volumes of data on commodity hardware). Heights provides seamless and transparent data access to Hadoop as just another data source, where Hive-based tables appear native to Heights. You can develop data management processes or analytics using Heights tools while optimizing run-time execution using Hadoop Distributed Process Capability or Heights environments. With Heights Information Management, you can effectively manage data and processing in the Hadoop environment.