Friday, December 4, 2015

An introduction to ESBs for Data Integration


From ETL Tools to ESBs
In the IT landscape, ETL (extract, transform, load) processes have long been used for building data warehouses and enabling reporting systems. Using business intelligence (BI) oriented ETL processes, businesses extract data from highly distributed sources, transform it through manipulation, parsing, and formatting, and load it into staging databases. From this staging area, data summarization and analytical processes then populate data warehouses and data marts.
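The extract, transform, and load steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CSV text, the `staging_orders` table, and all function names are hypothetical, and an in-memory SQLite database stands in for a real staging database.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an export from an operational system.
RAW_ORDERS_CSV = """order_id,customer,amount
1, alice ,19.99
2,BOB,5.00
3,alice,
"""


def extract(text):
    """Extract: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names, parse amounts, drop rows without an amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # incomplete record, skipped in this sketch
        cleaned.append(
            (int(row["order_id"]), row["customer"].strip().lower(), float(amount))
        )
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into a staging table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging_orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO staging_orders VALUES (?, ?, ?)", rows)
    conn.commit()


def summarize(conn):
    """Summarize the staging area per customer, as a warehouse load might."""
    cur = conn.execute(
        "SELECT customer, COUNT(*), SUM(amount) FROM staging_orders "
        "GROUP BY customer ORDER BY customer"
    )
    return cur.fetchall()


def run_pipeline():
    conn = sqlite3.connect(":memory:")  # stand-in for a staging database
    load(conn, transform(extract(RAW_ORDERS_CSV)))
    return summarize(conn)


print(run_pipeline())
```

A real ETL tool adds scheduling, error handling, and connectors for many source types; the sketch only shows the order of the three stages and the summarization that follows.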
ETL tools certainly have their place in the IT environment: many database administrators use them to streamline processes and deliver value to the business. Two common uses are:
  • Data Warehousing: Historically, the primary use for ETL tools has been to enable business intelligence. Pulling databases, application data, and reference data into data warehouses gives businesses visibility into their operations over time and enables management to make better decisions.
  • Data Integration: Data integration allows companies to migrate, transform, and consolidate information quickly and efficiently between systems of all kinds. ETL tools reduce the pain of manually entering data and allow dissimilar systems to communicate, all the while supplying a unified view.

ETL Tools Get Complicated

ETL tools do provide a method of communication between databases and applications, but they pose significant challenges over time. Because creating this type of connectivity requires comprehensive knowledge of each operational database or application, interconnectivity can get complicated, as it calls for very invasive custom integrations.
Over time, this approach grows increasingly complex, and the greater the number of interconnected systems, the more complicated things become. Moreover, with such tight coupling, interdependencies create the potential for big, unpredictable impacts when even the slightest changes are made. The custom point-to-point data-level integrations become a tangled web of brittle connections, quickly beginning to look like “spaghetti code”.
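A quick calculation shows how fast the tangle can grow. The sketch below assumes the worst case, in which every system is integrated directly with every other one; real landscapes are usually sparser, and the function name is only illustrative.

```python
def point_to_point_links(n_systems):
    """Links needed if every system integrates directly with every other one."""
    return n_systems * (n_systems - 1) // 2


# Doubling the number of systems roughly quadruples the number of links.
for n in (3, 5, 10, 20):
    print(f"{n} systems -> {point_to_point_links(n)} direct integrations")
```

Each of those links is a custom integration that has to be revisited whenever either end changes.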

APIs and ESBs Simplify Data Integration

The growing popularity of APIs has also made it much easier to create connectivity. With APIs, developers can access endpoints and build connections without in-depth knowledge of the system itself, which simplifies integration considerably. As ETL tools have remained focused on BI and big data solutions, and as traditional operational data integration methods have become outdated with the rise of cloud computing, ESBs have become a better option for creating connectivity.
An enterprise service bus (ESB) provides API-based connectivity with real-time integration. Unlike traditional ETL tools used for data integration, an ESB isolates applications and databases from one another by providing a middle service layer. This abstraction layer reduces dependencies by decoupling systems and provides flexibility. Developers can use pre-built connectors to create integrations without extensive knowledge of specific application and database internals, and can make changes quickly without fear of the entire integrated system falling apart. Shielded by APIs, applications and databases can be modified and upgraded without unexpected consequences. Compared with using ETL tools for operational integration, an ESB provides a more logical and well-defined approach to such an initiative.
Some of the commonly used ESBs in the software industry:
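The decoupling idea can be illustrated with a toy, in-process message bus. This is a sketch of the pattern only and not the API of any ESB product: `MiniBus`, the topic name, the adapter, and the two consuming systems are all hypothetical.

```python
from collections import defaultdict


class MiniBus:
    """Toy stand-in for an ESB: routes messages by topic to subscribers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, message):
        # The publisher never learns which systems consume the message.
        return [handler(message) for handler in self._handlers[topic]]


def crm_to_canonical(record):
    """Adapter: translate a CRM-specific record into a shared message shape."""
    return {"customer_id": record["CustID"], "email": record["EMail"].lower()}


# Two hypothetical downstream systems, each knowing only the shared shape.
billing_db = {}
mailing_list = set()


def billing_handler(msg):
    billing_db[msg["customer_id"]] = msg["email"]
    return "billing"


def marketing_handler(msg):
    mailing_list.add(msg["email"])
    return "marketing"


bus = MiniBus()
bus.subscribe("customer.updated", billing_handler)
bus.subscribe("customer.updated", marketing_handler)

delivered = bus.publish(
    "customer.updated",
    crm_to_canonical({"CustID": 42, "EMail": "Ann@Example.com"}),
)
print(delivered, billing_db, mailing_list)
```

If the CRM changes its field names, only `crm_to_canonical` needs to change; the billing and marketing handlers are untouched, which is the kind of isolation the middle layer is meant to provide.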
1. Oracle Service Bus (OSB)
2. Mule ESB
3. Fuse ESB
4. Talend ESB

Data Integration in a nutshell


Data integration is the combination of technical and business processes used to combine data from disparate sources into meaningful and valuable information. A complete data integration solution delivers trusted data from a variety of sources.

Various data integration solutions help you understand, cleanse, monitor, transform and deliver data so you can be sure the information is trusted, consistent and governed in real time.






Data Integration Areas

Data integration is a term covering several distinct sub-areas such as:
  • Data warehousing
  • Data migration
  • Enterprise application/information integration
  • Master data management

Data Integration Techniques

There are several organizational levels at which integration can be performed. Going down the list, the degree of automated integration increases.
Manual Integration or Common User Interface - users work with all the relevant information by accessing each source system or a web page interface directly. No unified view of the data exists.
Application Based Integration - requires the individual applications to implement all the integration effort. This approach is manageable only with a very limited number of applications.
Middleware Data Integration - moves the integration logic from the individual applications to a new middleware layer. Although the integration logic is no longer implemented in the applications, they still need to participate partially in the data integration.
Uniform Data Access or Virtual Integration - leaves data in the source systems and defines a set of views that give users a unified view across the whole enterprise. For example, when a user accesses customer information, the customer's details are transparently fetched from the respective systems. The main benefits of virtual integration are near-zero latency in propagating data updates from the source systems to the consolidated view and no need for a separate store for the consolidated data. The drawbacks include limited support for data history and version management, applicability only to 'similar' data sources (e.g. the same type of database), and the extra load that access to the data places on source systems that may not have been designed to accommodate it.
Common Data Storage or Physical Data Integration - usually means creating a new system that keeps a copy of the data from the source systems and stores and manages it independently of the original systems. The best-known example of this approach is the data warehouse (DW). The benefits include data version management and the ability to combine data from very different sources (mainframes, databases, flat files, etc.). Physical integration, however, requires a separate system to handle the vast volumes of data.
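The trade-off between the last two techniques can be shown with a small sketch. Everything here is hypothetical: two dictionaries stand in for source systems, a function plays the role of the virtual view, and a list plays the role of the separate physical store.

```python
# Hypothetical source systems, each holding part of the customer data.
crm = {1: {"name": "Ann"}}
billing = {1: {"balance": 120.0}}


def virtual_customer_view(customer_id):
    """Uniform data access: assemble the record from the sources on every read."""
    return {**crm[customer_id], **billing[customer_id]}


# Physical copy: a separate store of (version, customer_id, record) snapshots.
warehouse = []


def load_snapshot(version):
    """Common data storage: copy the current source state into the store."""
    for customer_id in crm:
        warehouse.append((version, customer_id, virtual_customer_view(customer_id)))


load_snapshot(version=1)
billing[1]["balance"] = 80.0  # the source system changes after the first load
load_snapshot(version=2)

# The virtual view reflects the change immediately but keeps no history.
live = virtual_customer_view(1)

# The physical store keeps every loaded version, at the cost of extra storage.
history = [record["balance"] for _, cid, record in warehouse if cid == 1]

print(live, history)
```

Note that every call to `virtual_customer_view` reads from both sources, which is the extra load on the source systems mentioned above.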
Some well-known data integration vendors, tools, and software solutions are listed below:
  • Actian Pervasive
  • Adeptia
  • Clover ETL
  • Dell Boomi
  • IBM
  • Informatica
  • Microsoft
  • Oracle
  • Pentaho
  • SAP
  • SAS
  • SnapLogic
  • Talend

Wednesday, February 4, 2015

Business Analytics and Business Intelligence in a nutshell

Business Analytics (BA) gathers insights from data that help users make business decisions and that can be used to automate and optimize business processes. It has no single clear definition, but according to Wikipedia, BA refers to the skills, technologies, and practices for continuous iterative exploration and investigation of past business performance to gain insight and drive business planning. Successful Business Analytics depends on high-quality data, on highly skilled analysts who understand both the technologies and the business, and on those analysts' commitment to sound decision making for the organization.
Examples of BA:
·    Data mining (exploring the data and finding patterns)
·    Statistical analysis, quantitative analysis (explaining why a particular result occurred)
·    Multivariate testing, A/B testing (running experiments to test previous decisions)
·    Predictive modelling, predictive analytics (forecasting results)
BA is an umbrella term!
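As a concrete instance of A/B testing, the sketch below compares the conversion rates of two page variants with a two-proportion z-test. The visitor and conversion counts are made up for illustration, the function name is hypothetical, and the test assumes samples large enough for the normal approximation to hold.

```python
import math


def ab_test_z(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: returns (z, two-sided p-value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up data: variant A converts 100 of 1000 visitors, variant B 130 of 1000.
z, p = ab_test_z(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference between the variants is unlikely to be due to chance alone; the threshold to act on is a business decision, not something the test decides.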
Types of Business Analytics
Reporting or Descriptive analytics
·         Its goal is to review the data and look for indications of success or failure in any given marketing plan; it is a rear view or a description of past events.
Modelling or Predictive analytics
·         It is the use of data to determine the probable future outcome of an event or a likelihood of a situation occurring. In other words, it tries to find rules for guessing or predicting the values of one or more variables in a data set from the values of other variables in the data set.
Data-Driven Strategy
·         It means planning a company's growth by basing decisions on the data the company has gathered.
Clustering
·         It is a way of segmenting a diverse group into a number of subgroups or clusters. It is used to group similar records together.
Affinity grouping
·         It determines which things go together by looking for relationships between fields and field values. It is frequently used in retail chains to plan the arrangement of items on store shelves. It consists of finding a model that describes the association between items, and it is used for predictive modelling and to describe items that frequently occur together.
Visual Analytics/Data visualisation
·         Data visualization refers to technologies that support visualization and sometimes interpretation of data and information. Visual tools can help identify relationships such as trends.
Web Analytics
·         Web analytics is a tool for measuring website traffic, but it can also be used for business research and market research. Web analytics applications can help companies measure the results of traditional print advertising campaigns, for example by estimating how traffic to a website changes after the launch of a new campaign. Web analytics provides information about the number of visitors to a website and the number of page views. It helps gauge traffic and popularity trends, which is useful for market research.
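Affinity grouping, described above, can be sketched by counting how often items appear together in the same basket. The baskets are made-up data and the function names are illustrative; production tools use dedicated association-rule algorithms rather than this brute-force count.

```python
from collections import Counter
from itertools import combinations

# Made-up shopping baskets, one set of items per purchase.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "cereal"},
]


def pair_support(baskets):
    """Fraction of baskets that contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing `antecedent`, the share also containing `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


support = pair_support(baskets)
print(support[("bread", "butter")])
print(confidence(baskets, "butter", "bread"))
```

Pairs with high support and confidence are candidates for being placed near each other on the shelf.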

The Processes of Business Analytics






Business Intelligence and Business Analytics
The term Business Intelligence (BI) is often used interchangeably with Business Analytics. Since BA is an umbrella term, however, BI is better considered a part of BA. Let us compare the two.

BI vs BA

Business Intelligence
Answers the questions:
·    What happened?
·    When?
·    Who?
·    How many?
Includes:
·    Reporting (KPIs, metrics)
·    Automated Monitoring/Alerting (thresholds)
·    Dashboards
·    Scorecards
·    OLAP (Cubes, Slice & Dice, Drilling)
·    Ad hoc query

Business Analytics
Answers the questions:
·    Why did it happen?
·    Will it happen again?
·    What will happen if we change x?
·    What else does the data tell us that we never thought to ask?
Includes:
·    Statistical/Quantitative Analysis
·    Data Mining
·    Predictive Modeling
·    Multivariate Testing

