Soda is an AI-native augmented data quality platform that helps customers find, understand and fix data quality issues as fast as possible. We meet users where they are: engineers can manage everything as code in Git, while business users create and review them in a business interface. Together they work in a shared workflow, powered by AI, to set quality expectations, monitor metrics with AI, and automatically isolate and remediate bad data directly in their environment. By uniting teams, automating with AI, and securing trust at the source, Soda restores confidence in data and decisions.
Ok, overall we really, really love the service. Here are the three strengths that come to mind. 1. Strong data quality approach: Soda's biggest strength is its declarative setup for defining data checks in code, versioning them and integrating them directly into pipelines. It's clean, scalable, and fits well into modern data workflows. 2. Great integration with modern data stacks: It plugs nicely into tools like dbt, Airflow and CI/CD workflows, which makes it easy to embed quality checks early in the pipeline rather than fixing issues later. 3. Real-time monitoring and anomaly detection: The platform can monitor large datasets and detect anomalies quickly, helping teams catch issues before they affect dashboards or business decisions.
The help and support provided. Soda listens to the customer.
- Variety of data sources supported and integrations available
- Customer support is highly responsive and ready to discuss further improvements based on the company's needs
- Flexibility and ease of use, due to the fact that it supports both a UI and a programmatic integration
Ok, so nothing major here; the strong points outweigh the bad ones. If I had to name three points, I would say: 1. Not very accessible for non-technical teams: Even though there's a UI, Soda is still quite code-first. Marketing and ops teams won't get as much value without support from the data engineers. 2. Features split between open-source and paid: The version we used was great, but many of the more advanced features sit behind a cloud, which can limit the value unless you upgrade. 3. Pricing can get tricky as you scale: Pricing is dataset-based and generally fair, but it's not always obvious what you'll need long-term, so costs can increase as usage grows.
Limited documentation; release notes for new versions are incomplete about the changes that need to be made. The functionality is immature, but at the same time that brings focus while the product is still under development.
- UI can become unresponsive if there are too many checks/data assets onboarded (at least in past versions)
- UI does not offer the full extent of visibility required by our data engineering teams
- The extent of programmatic integration in newer versions is less than expected, which creates friction for onboarding