    ©2026 Gartner, Inc. and/or its affiliates.
    All rights reserved.

Hadoop Distributions Reviews and Ratings

What are Hadoop Distributions?

Hadoop distributions are used to provide scalable, distributed computing against on-premises and cloud-based file store data. Distributions are composed of commercially packaged and supported editions of open-source Apache Hadoop-related projects. Distributions provide access to applications, query/reporting tools, machine learning and data management infrastructure components.

First introduced as collections of components for any use case, distributions are now often delivered as part of a specific solution for data lakes, machine learning or other uses. They subsequently grow into additional, expanded roles, competing with both older technologies like database management systems (DBMSs) and newer ones like Apache Spark.

Product Listings

Products 1 - 20 of 22

Amazon Athena

By Amazon Web Services (AWS)

4.4
(90 Ratings)

Amazon Athena lets users analyze data directly in Amazon Simple Storage Service (Amazon S3) using standard SQL queries. It is serverless, so there is no infrastructure to manage, and it handles structured, semi-structured, and unstructured datasets. Features include querying, data catalog integration, and support for file formats such as JSON, CSV, and Parquet. Organizations can use Amazon Athena for reporting, analytics, and data exploration over large and complex datasets stored in the cloud.

Amazon EMR

By Amazon Web Services (AWS)

4.4
(60 Ratings)

Amazon EMR processes and analyzes large datasets using open-source tools such as Apache Spark, Apache Hadoop, and Presto. It runs distributed data processing workloads on scalable cloud infrastructure and automates the provisioning and configuration of cluster resources. It supports batch processing, machine learning workflows, and interactive SQL queries. Amazon EMR is designed to reduce the operational overhead of managing big data environments and to optimize resource usage for analytics and business intelligence initiatives.

HPE Data Fabric Software

By Hewlett Packard Enterprise (HPE)

4.3
(37 Ratings)

HPE Data Fabric Software is a data platform that enables organizations to manage, access, and analyze large-scale data across hybrid and multicloud environments. It provides data storage, data integration, and real-time data streaming for both structured and unstructured data, along with unified data access, support for various analytics tools, and data governance and security capabilities. By easing the movement and management of diverse data types, it helps organizations derive insights and build data-driven applications while maintaining data consistency and control across distributed infrastructures.

Azure Data Lake Store

By Microsoft

4.3
(26 Ratings)

Azure Data Lake Store is a scalable, secure data repository for storing and analyzing large volumes of data. It integrates with analytical tools, supports high throughput for data workloads, and organizes files and datasets of various formats for parallel processing and direct access by big data analytics. It offers a hierarchical namespace, fine-grained access control, and compatibility with existing enterprise security frameworks. It is intended for enterprises seeking to optimize their data architecture for analytics, machine learning, and reporting.

FusionInsight Big Data Platform

By Huawei

4.9
(19 Ratings)

FusionInsight Big Data Platform integrates multiple big data components, such as Hadoop, Spark, Hive, and HBase, to enable data storage, processing, and analysis across various industries. It has a distributed architecture and manages structured and unstructured data at scale. It offers real-time processing, batch processing, and data mining, along with tools for managing, querying, and visualizing large datasets. The platform targets data integration, complex analytics, and performance optimization, with workflow and automation support that helps organizations derive insights, make data-driven decisions, and improve operational efficiency.

Google Cloud Platform

By Google

4.6
(16 Ratings)

Google Cloud Platform is a suite of cloud computing services that includes infrastructure as a service, platform as a service, and serverless computing environments. It provides tools for computing, storage, networking, data analytics, artificial intelligence, and machine learning. It supports deploying and scaling applications and services on a highly available and secure global infrastructure. Organizations use it to manage workloads, develop and run applications, and analyze large volumes of data, addressing needs such as resource optimization, faster time-to-market, and scalability.

IBM BigInsights for Apache Hadoop

By IBM

4
(11 Ratings)

IBM BigInsights for Apache Hadoop supports large-scale data management and analytics on the Apache Hadoop ecosystem. It provides tools for managing, processing, and analyzing vast amounts of structured and unstructured data, and it integrates with Hadoop distributions to offer advanced analytics, indexing, text analytics, and data visualization. It also includes components for security, data governance, and workload optimization. The software helps organizations derive value from complex data sets and improve decision-making efficiency.

Azure Data Lake Analytics

By Microsoft

4.2
(6 Ratings)

Azure Data Lake Analytics is distributed analytics software for processing large volumes of data on demand. Users develop and run parallel data transformation and processing programs in U-SQL, R, Python, and .NET over petabytes of data. Features include dynamic scaling, on-the-fly resource allocation, and the ability to handle complex queries across multiple data sources. It aims to simplify data integration, improve processing speed, and support custom analytics jobs without requiring users to manage infrastructure, helping organizations derive insights from structured and unstructured data.

Hortonworks Sandbox

By Cloudera

4.2
(6 Ratings)

Hortonworks Sandbox is a software environment designed to provide users with a pre-configured platform for learning, developing, and testing data applications using open-source technologies that are part of the Hadoop ecosystem. The software includes components such as Hadoop Distributed File System, Apache Hive, Apache Pig, and Apache HBase. It offers a virtual environment for experimenting with big data processing, data analysis, and management workflows without the need for complex setup. Hortonworks Sandbox addresses business challenges related to understanding and prototyping big data solutions in a contained environment, enabling users to explore data integration, transformation, and analysis scenarios.

Azure HDInsight

By Microsoft

3.6
(5 Ratings)

Azure HDInsight is cloud-based software for processing, analyzing, and managing large volumes of data using open-source frameworks such as Apache Hadoop, Apache Spark, and Apache Hive. It provides scalable and customizable clusters for big data analytics, covering data storage, processing, and reporting. It supports data integration from various sources; batch, interactive, and streaming analytics; and built-in monitoring and security. It helps users derive insights from large datasets for business intelligence, data science, and machine learning workloads.

Big Data Appliance (Legacy)

By Oracle

4.1
(5 Ratings)

Big Data Appliance (Legacy) supports organizations in managing, processing, and analyzing large volumes of structured and unstructured data. It integrates open-source components for big data processing, such as Hadoop and NoSQL databases, and provides a platform for data integration, loading, and transformation. Its data warehousing, analytics, and machine learning features let users derive insights from diverse data sources. It is designed to make storing and analyzing big data sets more efficient and to support business intelligence and data-driven decision making.

Oracle Big Data SQL

By Oracle

5
(5 Ratings)

Oracle Big Data SQL enables SQL queries across Hadoop, NoSQL, and Oracle Database environments. It provides a unified data access layer that lets organizations analyze large volumes of structured and unstructured data with their existing SQL-based tools and applications. It supports data integration from multiple sources, which helps address data silos and enables advanced analytics and reporting. With high-performance queries and scalable data processing, it supports decisions that depend on combining diverse data sets without moving or duplicating data across platforms.

IBM Open Platform (IOP)

By IBM

4
(3 Ratings)

IBM Open Platform (IOP) supports and manages open-source big data tools and components, with a focus on integration with the Hadoop ecosystem. It offers data storage, management, processing, and analytics using technologies including Apache Hadoop, Apache Hive, Apache HBase, Apache Spark, and Apache Ambari. It addresses the need for scalable and flexible data handling, so organizations can process large volumes of structured and unstructured data for analysis and reporting. It provides a framework for deploying, managing, and monitoring complex big data solutions on-premises or in cloud environments, supporting enterprise requirements for data analysis, governance, and workflow automation.

Oracle Big Data

By Oracle

4.7
(3 Ratings)

Oracle Big Data provides tools and technologies for managing, processing, and analyzing large volumes of structured and unstructured data. It offers data integration, storage, and advanced analytics across various formats and sources. It supports distributed computing environments, so users can work with Hadoop, NoSQL databases, and machine learning frameworks. It is aimed at transforming raw data into actionable insights, optimizing data workflows, and improving decision making through scalable analytics, while meeting enterprise data management requirements and interoperating with existing systems and cloud platforms.

Cloudera Director

By Cloudera

4
(2 Ratings)

Cloudera Director deploys, scales, and manages Apache Hadoop clusters in cloud environments. It lets users automate cluster provisioning, configure resources, and monitor operations to support big data workloads. It integrates with various cloud service providers and supports dynamic scaling based on workload demands. Features include customizable cluster templates, security and governance controls, and monitoring tools. It aims to streamline running data analytics infrastructure in the cloud through centralized management and operational controls.

Seabox BigData Platform

By Eastern Jin Technology (Seabox Data)

5
(2 Ratings)

Hadoop as a Service

By Idera (Qubole)

4
(1 Rating)

Hadoop as a Service by Qubole is cloud-based software that provides scalable data processing and analytics using Apache Hadoop. Organizations can manage, process, and analyze large volumes of structured and unstructured data without setting up or maintaining physical infrastructure. It automates cluster management, optimizes resource allocation, and supports various data sources and formats. It supports batch processing, data transformation, and advanced analytics, helping businesses with big data management, cost efficiency, and data-driven decision making. It also integrates with different data processing engines and provides monitoring and security features.

HDCloud (Legacy)

By Cloudera

4
(1 Rating)

HDCloud (Legacy) facilitates the deployment and management of Apache Hadoop clusters in cloud environments. It provides tools for automating cluster provisioning, configuration, and scaling, so users can process large volumes of data efficiently. It integrates with cloud infrastructure providers and allows elastic resource allocation based on workload requirements, which simplifies the setup and operation of distributed computing resources in the cloud. It also offers features for monitoring, security, and administration of Hadoop clusters.

Transwarp Data Hub

By Transwarp

5
(1 Rating)

Transwarp Data Hub integrates, manages, and processes large-scale structured and unstructured data from diverse sources within an enterprise environment. It offers data ingestion, transformation, synchronization, and governance, with centralized data storage and unified access. It supports real-time and batch data processing, metadata management, and secure data sharing across multiple business systems. It addresses data silos, enabling organizations to consolidate, organize, and analyze data for operational efficiency, regulatory compliance, and informed decision-making.

Oracle Big Data Cloud Service (Legacy)

By Oracle

Oracle Big Data Cloud Service enables organizations to manage, analyze, and process extensive volumes of structured and unstructured data across cloud environments. It integrates a range of open-source big data technologies, such as Hadoop, Spark, and Kafka, with Oracle’s security and automation capabilities. It offers a scalable and flexible platform for storing and computing on large datasets, supporting advanced analytics, machine learning, and real-time data streaming. It helps enterprises capture, organize, and derive insights from diverse data sources in support of data-driven decision making and operational efficiency.

Gartner Research

Market Guide for Hadoop Distributions

Popular Product Comparisons

Amazon EMR vs Google Cloud Platform
Amazon EMR vs Azure Data Lake Store
Azure Data Lake Store vs Google Cloud Platform

Gartner Peer Insights content consists of the opinions of individual end users based on their own experiences, and should not be construed as statements of fact, nor do they represent the views of Gartner or its affiliates. Gartner does not endorse any vendor, product or service depicted in this content nor makes any warranties, expressed or implied, with respect to this content, about its accuracy or completeness, including any warranties of merchantability or fitness for a particular purpose.
