Gartner: Magic Quadrant for Data Integration Tools 


Gartner
Ehtisham Zaidi, Sharat Menon, Robert Thanaraj, Eric Thoo, Nina Showell

November 2021

This is an excerpt of the publication above, detailing only one of the Leaders in the segment. For the full version, please refer to the original publication.


Editor of this electronic version:


Joaquim Cardoso MSc.
The Health Revolution

Multidisciplinary Institute for Better Health for All
June 2, 2022


The data integration tool market is seeing renewed momentum, driven by requirements for hybrid and multicloud data integration, augmented data management, and data fabric designs. 

This assessment of 18 vendors will help data and analytics leaders choose a best fit for their data integration needs.

Significant Events

  • Closed Corporate Transaction Notification: HVR, Data Integration Tools (09 November 2021)
  • Closed Corporate Transaction Notification: Fivetran, Data Integration Tools (09 November 2021)

Strategic Planning Assumptions

  • Through 2022, manual data management tasks will be reduced by 45% through the addition of machine learning and automated service-level management.

  • By 2023, AI-enabled automation in data management and integration will reduce the need for IT specialists by 20%.

Market Definition/Description


Gartner defines data integration as the discipline comprising the architectural patterns, tools and methodologies that allow organizations to access, harmonize, transform, process and move data spanning various endpoints and across any infrastructure.


The market for data integration tools includes vendors that offer a stand-alone software product (or products) to enable the construction and implementation of data access and data delivery infrastructure for a variety of data integration scenarios.


These include (but are not limited to):

  1. Data Engineering
  2. Cloud Migration
  3. Operational Data Integration
  4. Data Fabric


1. Data Engineering:

Building, managing and operationalizing data pipelines in support of various analytics and data science demands (for example, logical data warehouse, analytics/business intelligence [BI] and machine learning, among other analytical use cases) by following defined architectural patterns, tools and methodologies. 

This use case also requires the data integration tool vendors to deliver capabilities within their data integration tools that enable:

  • Support for optimized delivery of analytics: 

This is the ability to provide access to various heterogeneous data and application sources and then connect or collect data from these sources into a target store that is optimized for delivering integrated data for analytics use cases. 

This includes the ability to support data warehouse deployments and to manage push-down optimizations into these data warehouses to support the transformations needed to create data models that are most suited for an organization’s evolving analytics needs.
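
As a simplified illustration, a push-down transformation can be expressed as SQL that executes inside the warehouse engine rather than in the integration tool itself. The sketch below assumes a SQLAlchemy-compatible warehouse connection; the connection URL and the raw.orders/analytics.daily_sales tables are hypothetical.

```python
# A minimal push-down sketch: the aggregation is shipped to the warehouse as
# SQL and computed there, instead of pulling rows into the integration tool.
# The connection URL and table names are hypothetical.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://user:pass@warehouse-host/analytics")  # hypothetical

PUSHDOWN_SQL = text("""
    CREATE TABLE IF NOT EXISTS analytics.daily_sales AS
    SELECT order_date,
           region,
           SUM(amount) AS total_amount,  -- computed inside the warehouse
           COUNT(*)    AS order_count
    FROM raw.orders
    GROUP BY order_date, region
""")

with engine.begin() as conn:
    conn.execute(PUSHDOWN_SQL)  # only the statement crosses the network
```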

  • Data lake enablement: 

Data integration tools support the ingestion of data in its “native” format to data stores that support the data lake requirements of an organization (including cloud object stores or file stores). 

However, mere movement of data to the data lake is not enough. 

Organizations require their data integration tools to support the transformation and operationalization of the data (which includes data preparation, schema assignment, managing and supporting mappings, and delivering the data to supported and consuming applications).
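
A minimal PySpark sketch of this pattern follows; the bucket paths and event schema are hypothetical. Raw JSON is ingested in its native format, a schema is assigned, and a curated columnar copy is delivered for consuming applications.

```python
# Hypothetical paths and schema; illustrates ingest-in-native-format followed
# by schema assignment and delivery of a curated, partitioned copy.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import (StructType, StructField, StringType,
                               DoubleType, TimestampType)

spark = SparkSession.builder.appName("lake-ingest").getOrCreate()

# Explicit schema assignment rather than relying on inference.
event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("user_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = spark.read.schema(event_schema).json("s3://my-lake/raw/events/")

# Operationalize: deduplicate and deliver a partitioned Parquet copy.
(raw.dropDuplicates(["event_id"])
    .withColumn("event_date", F.to_date("event_time"))
    .write.mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://my-lake/curated/events/"))
```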

  • Self-service data preparation:

The ability to provide user experiences that enable end users to develop and manage integrations independent of the tool vendors’ professional services. 

These experiences must support multiple integration personas, particularly integration specialists and ad hoc/citizen integrators.



2. Cloud Migration:


This use case requires data integration tools to support the migration and modernization of data and analytics workloads to public cloud infrastructure — usually involving an architecture that spans on-premises and one or more cloud ecosystems. 

This use case also requires the vendors to deliver capabilities within their data integration tools that enable:

  • Data migration and consolidation:

Data integration tools increasingly address the data movement and transformation needs of data migration and consolidation, such as the replacement of legacy applications or the migration to new computing environments.
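
A minimal sketch of the table-by-table migration pattern, assuming SQLAlchemy connection URLs and a hypothetical table list; production migrations layer type mapping, validation and cutover logic on top of this.

```python
# Hypothetical connection URLs and table list; copies tables in chunks so
# large tables never have to fit in memory at once.
import pandas as pd
from sqlalchemy import create_engine

legacy = create_engine("postgresql://user:pass@legacy-host/erp")
cloud = create_engine("postgresql://user:pass@cloud-host/warehouse")

for table in ["customers", "orders", "invoices"]:
    for chunk in pd.read_sql(f"SELECT * FROM {table}", legacy, chunksize=50_000):
        chunk.to_sql(table, cloud, if_exists="append", index=False)
```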

  • Cloud data delivery options: 

The ability to deliver data integration capabilities as cloud services for hybrid, multicloud, and intercloud integration scenarios.



3. Operational Data Integration:


Supporting operational/transactional data integration use cases such as master data management and interenterprise data acquisition and sharing (including the ability to create data hubs for integration when needed). 

This includes the ability of data integration tools to integrate, consolidate and synchronize data related to critical business processes and to support data governance initiatives. 

Common requirements of data integration tools include:

  • Sourcing and delivery of master data in support of master data management (MDM): 

This involves enabling the connectivity and integration of data representing critical business entities such as customers, products and employees. 

Data integration tools can be used to build the data access and synchronization processes to support MDM initiatives.
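
As a simplified illustration, the sketch below sources customer records from two hypothetical systems and consolidates them on a shared key; a real MDM flow adds matching, cleansing and stewardship.

```python
# Hypothetical files, columns and survivorship rule (most recent record wins).
import pandas as pd

crm = pd.read_csv("crm_customers.csv")         # email, name, phone, updated_at
billing = pd.read_csv("billing_customers.csv")

candidates = pd.concat([crm, billing], ignore_index=True)

# Simple survivorship: keep the most recently updated record per customer.
golden = (candidates
          .sort_values("updated_at")
          .drop_duplicates(subset="email", keep="last"))

golden.to_csv("master_customers.csv", index=False)  # deliver to the MDM hub
```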

  • Interenterprise data sharing: 

Organizations are increasingly required to provide data to, and receive data from, external trading partners (customers, suppliers, business partners and others).

  • Data consistency between operational applications: 

Data integration tools provide the ability to ensure database-level consistency across applications, both on an internal and an interenterprise basis, and in a bidirectional or unidirectional manner.
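
A minimal sketch of unidirectional consistency using an updated_at watermark follows; the connection URLs, table and conflict rule are hypothetical, and a bidirectional setup would also need conflict resolution.

```python
# Hypothetical URLs, table and key; copies rows changed since the last sync
# and upserts them so reruns stay consistent.
import pandas as pd
from sqlalchemy import create_engine, text

source = create_engine("postgresql://user:pass@orders-app/db")
target = create_engine("postgresql://user:pass@shipping-app/db")

def sync_once(last_sync):
    changed = pd.read_sql(text("SELECT * FROM orders WHERE updated_at > :ts"),
                          source, params={"ts": last_sync})
    if changed.empty:
        return last_sync
    changed.to_sql("orders_staging", target, if_exists="replace", index=False)
    with target.begin() as conn:
        conn.execute(text("""
            INSERT INTO orders SELECT * FROM orders_staging
            ON CONFLICT (order_id) DO UPDATE SET status = EXCLUDED.status
        """))
    return changed["updated_at"].max()  # new watermark for the next cycle
```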



4. Data Fabric:


A data fabric architecture enables faster access to trusted data across distributed landscapes by utilizing active metadata, knowledge graphs, semantics and ML capabilities of data integration (as well as other data management tools, including data catalogs and data governance). 

Data integration tools must enable the creation and delivery of data fabric design patterns that enable multiple producers and consumers of data to be brought together through better integration, collaboration and automation of data pipelines. 

The data fabric use case requires data integration tools to be able to deliver data in various styles (not just batch, but a combination of batch with data virtualization, streaming, messaging or API-based delivery styles). 

Importantly, organizations need their data integration tools to be able to both deliver integrated data as data services and orchestrate those services (see below):

  • Data services orchestration: 

This is the ability to deploy all aspects of runtime data integration functionality as data services (for example, deployed functionality can be called via a web service interface).
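
A minimal sketch of the idea, using Flask to expose integration output through a web service interface; the endpoint, table and connection URL are hypothetical.

```python
# Hypothetical endpoint, table and connection URL; consumers call the service
# instead of reading the pipeline's target store directly.
import pandas as pd
from flask import Flask, jsonify
from sqlalchemy import create_engine, text

app = Flask(__name__)
engine = create_engine("postgresql://user:pass@warehouse-host/analytics")

@app.route("/api/customers/<customer_id>")
def customer_profile(customer_id):
    # The same integrated view a batch pipeline would deliver, on demand.
    df = pd.read_sql(text("SELECT * FROM integrated_customers WHERE id = :cid"),
                     engine, params={"cid": customer_id})
    return jsonify(df.to_dict(orient="records"))

if __name__ == "__main__":
    app.run(port=8080)
```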


Conclusion


Customers must be able to implement and support these use cases with independent offerings from data integration vendors; the use of third-party tools or data integration capabilities embedded in other solutions should not be required. 


Vendors that sell data integration technology as part of other solutions (such as analytics platforms, DBMSs, and packaged or SaaS applications) are not considered data integration tool vendors by Gartner.


Our evaluation of data integration tools does not include open-source frameworks, general-purpose development platforms or programming interfaces. 

Such “general purpose” data integration frameworks or platforms, and those that require heavy customization by developers to engineer them for specific data integration scenarios, are excluded from this Magic Quadrant.

Vendors evaluated in this Magic Quadrant offer at least one commercial off-the-shelf tool that is purpose-built for data integration and transformation.


Data integration tools are required to execute many of the core functions of data integration, which can be applied to any of the above scenarios. 

(For a detailed list and explanation of all core capabilities and use cases of tools in the data integration market, see Critical Capabilities for Data Integration Tools).


Magic Quadrant

Figure 1: Magic Quadrant for Data Integration Tools

Source: Gartner (August 2021)


Vendor Strengths and Cautions [excerpt]

Informatica


Informatica is a Leader in this Magic Quadrant; in the previous iteration of this research, it was also a Leader. Informatica is headquartered in Redwood City, California. It offers the following data integration tools as part of its Intelligent Data Management Cloud: Informatica Intelligent Cloud Services (IICS) (which includes the following services related to data integration: Cloud Data Integration, Cloud Data Integration Elastic, Cloud Mass Ingestion, Cloud Integration Hub and Cloud B2B), Data Engineering Integration, Enterprise Data Preparation, Enterprise Data Catalog, Data Engineering Streaming, PowerCenter, and PowerExchange. Informatica has over 10,000 customers for these product lines. Its operations are geographically diversified and its clients are primarily in the financial services, healthcare and public sectors.


Strengths

  • Product investments aligned to the data fabric vision: Informatica has continued to invest in tools that are aligned to the data fabric design vision. Informatica CLAIRE, which is an active-metadata-based AI and ML engine, continues to be highlighted by customers for enabling automation in data integration design and delivery. Informatica has also invested in enabling knowledge graphs (through recent acquisitions of Compact Solutions and GreenBay Technologies) as part of its data integration offerings to support complex modeling tasks involving multirelationship data. Finally, customers stitching data fabric designs benefit from the deep integration of Informatica Enterprise Data Catalog with data integration pipelines, which enables active metadata sharing for insights and automation of tasks.

  • Strength in data engineering use cases: Informatica’s customers praise its scalable data engineering tools for their ability to handle all data movement topologies. Customers also call out the scalability and performance optimization of its tools for complex transformations requiring push-down optimization and native Spark elastic and serverless capabilities for massively parallel processing (MPP) support. Informatica Cloud Mass Ingestion has been praised by data engineers who are looking to support real-time replication from file, database, applications and event streams through one common platform.

  • Strong execution for operational data integration use cases: Beyond just analytics and data science use cases, customers looking to implement a data hub for integration, governance and sharing (among other operational use cases) call out Informatica’s solutions as relatively mature. The Cloud Integration Hub offering is frequently selected by customers in competitive situations for its ability to support all data modalities (including batch, virtual, streaming and API-based integration) and for its ability to integrate and deliver data independently to a multicloud hybrid environment.

Cautions

  • Challenges with manual migration from PowerCenter to IICS: Although Informatica continues to support PowerCenter on-premises deployments, a growing number of customers are looking to migrate existing workloads from PowerCenter to IICS. Customers looking to manually repoint PowerCenter mappings to IICS mappings have faced challenges ranging from cost to performance and even productivity of data teams. Reference customers have stated that they needed significant upfront planning and Informatica (or partner) professional services support to successfully make this migration. To accelerate these migration efforts, Informatica has launched a modernization program that includes automated migration utilities.

  • Less visibility and understanding of Informatica’s new pricing model: Informatica has launched a new consumption-based pricing model based on Informatica Processing Units (IPUs), which is a unit of software licensing capacity measured by service usage. While this model does enable better scaling in cloud environments, existing customers cite challenges in upgrading to this model. Customers find it difficult to forecast usage upfront to understand how many IPUs should be licensed, and they need a better understanding of how existing licensing can be synchronized and mapped to the new model. In order to mitigate some of those challenges, Informatica has launched a new tool that can assist customers with forecasting their IPU needs and providing a degree of overdraft protection on unexpected consumption.
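
To illustrate only the shape of such a forecast (the services, rates and volumes below are invented for the example and are not actual Informatica IPU pricing), a consumption estimate reduces to usage per service times a rate, plus headroom:

```python
# Purely illustrative: these rates and volumes are invented, NOT actual
# Informatica IPU pricing. Shows the structure of a consumption forecast.
monthly_usage = {"data_integration_hours": 1200, "mass_ingestion_gb": 800}
ipu_rates = {"data_integration_hours": 0.15, "mass_ingestion_gb": 0.05}

base_ipus = sum(monthly_usage[k] * ipu_rates[k] for k in monthly_usage)
headroom = 0.20  # buffer against unexpected consumption ("overdraft")

print(f"Estimated monthly IPUs: {base_ipus:.0f}")                            # 220
print(f"Suggested license with headroom: {base_ipus * (1 + headroom):.0f}")  # 264
```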

  • Some DataOps-related challenges: A few Informatica customers requested better DataOps capabilities, including improved scheduling capabilities for pipelines. Informatica has invested in CI/CD support, better regression testing and improved integration with Git, but existing customers are mostly unaware of these capabilities or report challenges with them.

For the analysis of the other vendors, please refer to the original publication.


Originally published at https://www.gartner.com
