Overview
Product Information on Amazon EMR
What is Amazon EMR?
Amazon EMR Pricing
Overall experience with Amazon EMR
“Amazon EMR’s Impact on Cost Savings and Reporting Service Performance Improvements”
“Awesome product but needs some changes”
About Company
Company Description
Amazon Web Services (AWS), established in 2006, provides essential infrastructure services to businesses globally in the form of cloud computing. The key advantage of cloud computing, particularly via AWS, is its capacity to turn fixed infrastructure expenses into flexible costs. With AWS, businesses can forgo extensive planning and procurement of servers and other Information Technology (IT) resources. AWS seeks to give businesses prompt and cost-effective access to resources, as and when they need them, by drawing on Amazon's expertise and economies of scale. Currently, AWS offers a robust, scalable, economical infrastructure platform in the cloud that powers an extensive array of businesses worldwide. It operates across numerous industries, with data center locations in various parts of the globe including the U.S., Europe, Singapore, and Japan.
Company Details
Key Insights
A Snapshot of What Matters - Based on Validated User Reviews
Reviewer Insights for: Amazon EMR
Performance of Amazon EMR Across Market Features
Amazon EMR Likes & Dislikes
The most significant advantage of Amazon EMR is its exceptional cost management capabilities. EMR's architecture, with core nodes that run 24/7 and transient task nodes that automatically turn on and off, is unmatched by other providers, who typically offer only all-on-demand or all-spot auto-scaling groups. Another highly valued feature is the flexibility in choosing instance families. For example, within the R5 family, we can select from R5, R5A, R5G (Graviton, which saves cost), or R5D (which adds local NVMe storage). The ability to use multiple node types within a family and assign weightage (e.g., prioritizing cost-saving Graviton instances) provides unmatched flexibility tailored to our specific use case. This also extends to time-dependent scaling, allowing us to allocate more on-demand instances during peak US hours and shift to more spot instances during India's non-peak hours, resulting in significant daily cost savings. Furthermore, EMR's seamless integration with other AWS services, such as Amazon S3, is a major benefit. Our underlying data layer is S3, and EMR provides an intuitive, built-in integration that largely eliminates the need for custom code, relying instead on simple dropdown configurations. It also integrates easily with the open-source coding platforms we use. Finally, EMR prioritizes customer experience and ensures operational resilience. If spot machines are unavailable during peak times, EMR automatically waits for a configured period (e.g., 5 minutes) and then provisions on-demand instances to prevent customer impact. While this may temporarily increase cost, nodes automatically shut down when the load subsides. EMR also has built-in fault tolerance, automatically spinning up a new node if one goes down, which ensures continuous availability and prevents downtime.
hands on, configurability
Great high compute platform
My primary concern with Amazon EMR is a significant limitation in its configuration flexibility: once a cluster has been created, its instance family cannot be changed from the UI. If an incorrect or higher-cost instance family is selected during initial cluster setup, there is no direct way to modify it. To rectify such an error, one must terminate the existing cluster, create a clone, make the necessary changes, and then redeploy. This process inevitably leads to downtime or means incurring additional cost by running a parallel cluster until the new one is operational. This is a major pain point, and I have seen similar feedback online from other customers. While EMR allows editing parameters such as the number of spot or on-demand nodes, the fundamental instance family cannot be altered after the cluster is created. Most other AWS features offer an edit option, and its absence for EMR cluster instance families is a notable drawback that, if resolved, would significantly enhance the product's usability and flexibility for many users.
finding failure is hard
Not much I can think of.
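The spot-to-on-demand fallback and the weighted mix of instance types described in the first "like" above can be sketched as an EMR instance fleet definition. This is a minimal illustration, not the reviewer's actual configuration: the function name `build_task_fleet`, the fleet name, the capacities, and the choice of instance types and weights are all assumptions. The dictionary keys follow the `InstanceFleets` shape accepted by boto3's EMR `run_job_flow` call; note that in that API `WeightedCapacity` counts units toward the target capacity rather than expressing a strict priority.

```python
def build_task_fleet(spot_units, on_demand_units, timeout_minutes=5):
    """Build a task instance fleet definition (hypothetical values).

    The result has the shape of one entry in Instances -> InstanceFleets
    for boto3's EMR run_job_flow. Nothing here calls AWS.
    """
    return {
        "Name": "task-fleet",  # illustrative name
        "InstanceFleetType": "TASK",
        "TargetSpotCapacity": spot_units,
        "TargetOnDemandCapacity": on_demand_units,
        # Several instance types from one family; the weights are
        # assumed values: units each instance contributes to the target.
        "InstanceTypeConfigs": [
            {"InstanceType": "r5.xlarge", "WeightedCapacity": 1},
            {"InstanceType": "r5a.xlarge", "WeightedCapacity": 1},
            {"InstanceType": "r5d.2xlarge", "WeightedCapacity": 2},
        ],
        "LaunchSpecifications": {
            "SpotSpecification": {
                # Wait this long for spot capacity, then fall back.
                "TimeoutDurationMinutes": timeout_minutes,
                "TimeoutAction": "SWITCH_TO_ON_DEMAND",
            }
        },
    }


# An 80/20 spot to on-demand target with a 5-minute spot timeout.
fleet = build_task_fleet(spot_units=8, on_demand_units=2)
```

Time-dependent scaling, as the reviewer describes it, would amount to calling a builder like this with different spot and on-demand targets for peak and off-peak windows.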
Top Amazon EMR Alternatives
Peer Discussions
Amazon EMR Reviews and Ratings
- Senior Director Of Technology | 1B-10B USD | IT Services
Amazon EMR’s Impact on Cost Savings and Reporting Service Performance Improvements
My organization has used Amazon EMR for about 45 days, and the overall experience has been great. We chose EMR because our existing infrastructure is Amazon-heavy, so we evaluated AWS solutions ahead of external vendors. EMR provided more flexibility in cost management than auto-scaling groups, especially with its dynamic handling of node counts and its mixing of on-demand and spot instances. EMR addressed two key business pain points. First, it delivered significant cost savings. We reduced daily costs by 40-45% for a service that previously ran on 100% on-demand instances, by implementing an 80/20 distribution of spot to on-demand nodes. This points to further monthly savings if all our Hadoop loads transition to EMR. Other teams have also started using EMR for heavy Hadoop loads because of its cost optimization. Second, EMR vastly improved our heavy reporting service for customers, which previously suffered from report queuing and throttling during peak morning hours. Our report response time improved dramatically, from approximately 30 minutes to 2 minutes. The system spins up additional nodes when reports are triggered and shuts them down during lean periods, avoiding unnecessary cost. With minimal deployment effort, this directly led to the retention of a significant FMCG giant customer in the US, a contract whose loss could have cost us over 5% of our overall revenue. From an onboarding perspective, my prior experience with EMR made the process straightforward. AWS offers extensive use cases and design diagrams, and paid support typically responds within 2-24 hours. Customization was not a major concern, as EMR’s features and documentation meet most industry needs. EMR efficiently handles large-scale data processing, with configurable node limits and built-in fault tolerance. A caution regarding performance: for very large node counts during peak hours, it's advised to maintain at least a 50/50 split between spot and on-demand instances to avoid availability issues.
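As a rough check on the figures in this review, the relation between the spot share and the overall saving can be worked through in a few lines. This is a simplified sketch under stated assumptions: the same number of node-hours before and after, a single average spot discount relative to the on-demand price, and no other cost changes. The function names are hypothetical and the review does not state the actual spot discount.

```python
def blended_savings(spot_share, spot_discount):
    """Fractional saving versus an all-on-demand baseline.

    Only the spot portion of the fleet is cheaper, so the overall
    saving is the spot share times the average spot discount.
    """
    return spot_share * spot_discount


def implied_spot_discount(savings, spot_share):
    """Average spot discount needed to reach a given overall saving."""
    return savings / spot_share


# With 80% of nodes on spot, the reported 40-45% daily saving implies
# an average spot discount of roughly 50% to 56% under these assumptions.
low = implied_spot_discount(0.40, 0.80)
high = implied_spot_discount(0.45, 0.80)
```

The same arithmetic shows the cost of the 50/50 split recommended for very large clusters: at a given spot discount, halving the spot share from 80% to 50% reduces the overall saving proportionally.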
- Director of Engineering | 50M-1B USD | Banking
EMR Delivers Strong Compute Capacity for PySpark-Based Data Processing Tasks
EMR provides high compute capacity for our data processing needs. We have several models being executed on EMR using PySpark.
- VP, Data and Analytics | Gov't/PS/Ed | Government
Platform Utilizes MapReduce Components and Supports Spot CPU Usage
The platform consists of standard MapReduce components and can use spot CPU.
- Engineer | <50M USD | Banking
Comprehensive Spark Configurations Available Through Flexible API Calls in This Product
Very complete. I can work with Spark configs through API calls better than in other services.
- Data Analyst | 50M-1B USD | Retail
Awesome product but needs some changes
Good, but it could be better with more features and controls for engineers.


