Data preparation is an iterative and agile process for finding, combining, cleaning, transforming and sharing curated datasets for various data and analytics use cases, including analytics/business intelligence (BI), data science/machine learning (ML) and self-service data integration. Data preparation tools promise faster delivery of integrated and curated data by allowing business users, including analysts, citizen integrators, data engineers and citizen data scientists, to integrate internal and external datasets for their use cases. Furthermore, they allow users to identify anomalies and patterns, and to review and improve the data quality of their findings, in a repeatable fashion. Some tools embed ML algorithms that augment and, in some cases, completely automate certain repeatable and mundane data preparation tasks. Reduced time to delivery of data and insight is at the heart of this market.
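As a rough illustration of the clean, combine and review steps described above, the following Python sketch uses only the standard library. The datasets, field names and the anomaly rule (an amount more than ten times the median) are hypothetical and chosen purely for illustration; commercial data preparation tools do considerably more.

```python
from statistics import median

# Hypothetical internal dataset: customer orders with inconsistent formatting.
internal_orders = [
    {"customer_id": " c001 ", "amount": "120.50"},
    {"customer_id": "C002", "amount": "95.00"},
    {"customer_id": "C003", "amount": ""},         # missing value
    {"customer_id": "C004", "amount": "110.00"},
    {"customer_id": "C005", "amount": "9800.00"},  # suspiciously large
]

# Hypothetical external dataset: region lookup keyed by customer id.
external_regions = {"C001": "EMEA", "C002": "APAC", "C004": "AMER", "C005": "EMEA"}


def clean(record):
    """Normalize the identifier and parse the amount; return None if unusable."""
    customer_id = record["customer_id"].strip().upper()
    raw_amount = record["amount"].strip()
    if not customer_id or not raw_amount:
        return None
    return {"customer_id": customer_id, "amount": float(raw_amount)}


def combine(record, regions):
    """Enrich a cleaned record with a region from the external dataset."""
    return {**record, "region": regions.get(record["customer_id"], "UNKNOWN")}


def flag_anomalies(records, factor=10.0):
    """Mark amounts above `factor` times the median (an arbitrary, assumed rule)."""
    if not records:
        return []
    typical = median(r["amount"] for r in records)
    return [{**r, "anomaly": r["amount"] > factor * typical} for r in records]


def prepare(orders, regions):
    """Repeatable pipeline: clean, combine, then flag anomalies for review."""
    cleaned = [c for c in (clean(o) for o in orders) if c is not None]
    combined = [combine(c, regions) for c in cleaned]
    return flag_anomalies(combined)


curated = prepare(internal_orders, external_regions)
```

Because the whole pipeline is a single function, rerunning `prepare` on refreshed inputs repeats exactly the same steps, which is the "repeatable fashion" the definition refers to.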
Master data management (MDM) is a technology-enabled business discipline in which business and IT work together to ensure the uniformity, accuracy, stewardship, governance, semantic consistency and accountability of an enterprise’s official shared master data assets. Master data is the smallest consistent and uniform set of identifiers and attributes that uniquely describes the core entities of the enterprise and is used across multiple business processes.
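To make the idea of a small, shared set of identifiers and attributes more concrete, here is a minimal Python sketch that consolidates customer records from two hypothetical source systems into one master record per entity. The system names, field names and the "first non-empty value wins" survivorship rule are assumptions made for illustration; they do not describe any particular MDM product.

```python
# Hypothetical records for the same customers held in two source systems.
crm_records = [
    {"tax_id": "DE-123", "name": "Acme GmbH", "email": "info@acme.example",
     "last_campaign": "spring"},
    {"tax_id": "FR-456", "name": "Globex SA", "email": "",
     "last_campaign": "summer"},
]
billing_records = [
    {"tax_id": "de-123", "name": "ACME GmbH", "email": "",
     "payment_terms": "net30"},
    {"tax_id": "FR-456", "name": "Globex SA", "email": "billing@globex.example",
     "payment_terms": "net60"},
]

# Assumed master attributes: only those shared across business processes.
# System-specific fields (last_campaign, payment_terms) stay in their systems.
MASTER_ATTRIBUTES = ("name", "email")


def build_master_records(*sources):
    """Merge records by a normalized identifier; first non-empty value wins."""
    master = {}
    for source in sources:
        for record in source:
            key = record["tax_id"].strip().upper()  # uniform identifier
            entry = master.setdefault(key, {"tax_id": key})
            for attr in MASTER_ATTRIBUTES:
                value = record.get(attr, "")
                if value and attr not in entry:
                    entry[attr] = value
    return master


master_data = build_master_records(crm_records, billing_records)
```

In this sketch the order of the sources acts as a crude precedence rule. In practice, deciding which system is authoritative for each attribute is a stewardship and governance decision rather than a purely technical one.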