DMR News

Advancing Digital Conversations

Snowflake invests in Metaplane to address data quality challenges hindering AI development.

By Yasmeeta Oon

May 18, 2024

Today, Snowflake announced an investment in Metaplane, a Boston-based startup specializing in identifying and rectifying data quality issues using an end-to-end AI-powered platform. Although the investment amount remains undisclosed, Snowflake states that this move will lead to a tighter integration between Metaplane’s data observability services and the Snowflake data cloud. This integration aims to help users monitor and maintain the quality of their data, which is essential for various downstream projects, including AI applications.

The collaboration targets how enterprises monitor and maintain data quality. Metaplane, which competes with heavily funded players like Monte Carlo and Acceldata, will launch a native app for the Snowflake platform. The deal is Snowflake’s fifth investment this year and its second in the observability domain, following its March investment in Observe, a company that analyzes telemetry from enterprise applications to help users quickly identify and resolve incidents.

In the modern business landscape, data drives applications, including AI chatbots based on retrieval-augmented generation (RAG). However, many organizations struggle to maintain data quality due to the sheer volume of information spread across siloed systems, databases, and applications. Teams often find it challenging to track and identify issues or abnormalities within their data. Complex pipelines can involve hundreds or even thousands of sources that need to be managed effectively.

Metaplane addresses this issue by applying AI at different layers of the data stack, from ingestion to consumption. Founded by MIT graduate Kevin Hu, former HubSpot engineer Peter Casinelli, and ex-Appcues developer Guru Mahendran, the company integrates its platform with tools across the data stack, such as Fivetran, BigQuery, dbt, Airflow, and Tableau. The platform employs machine learning (ML) models to analyze the entire data profile, encompassing historical metadata, lineage, and logs. Once trained, these models automatically flag data anomalies, including schema changes, based on user-defined monitors.
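To make the idea of flagging anomalies from historical metadata concrete, here is a minimal sketch. It is not Metaplane's model, which the article does not describe in detail. It swaps in a simple z-score check, and the function name `is_anomalous`, the threshold, and the sample row counts are all hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Return True if `latest` is more than `threshold` standard
    deviations away from the mean of `history` (a plain z-score test,
    used here as a stand-in for a trained ML model)."""
    if len(history) < 2:
        # Not enough history to estimate spread, so flag nothing.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is unusual.
        return latest != mu
    return abs(latest - mu) / sigma > threshold


# Hypothetical daily row counts for one table over the past week.
daily_row_counts = [10_120, 10_340, 9_980, 10_205, 10_410, 10_150, 10_275]

print(is_anomalous(daily_row_counts, 10_300))  # a typical day
print(is_anomalous(daily_row_counts, 4_200))   # a sudden drop in volume
```

A production system would likely also account for seasonality and trend, which a fixed z-score threshold ignores.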

In an interview with VentureBeat, Kevin Hu explained that setting up these monitors takes about 15 minutes, allowing users to track data quality metrics such as freshness, row count, uniqueness, and nullness. Alerts are sent directly to the relevant data teams via their preferred communication channels.
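The four metrics named above can be illustrated with a short sketch. This is an assumption about how such metrics are commonly defined, not Metaplane's implementation. The function `quality_metrics`, the column names, and the sample rows are hypothetical.

```python
from datetime import datetime, timezone


def quality_metrics(rows, key, column, updated_at, now):
    """Compute four common data quality metrics over a list of dict rows.

    row_count       : number of rows
    uniqueness      : share of distinct values in the `key` column
    nullness        : share of rows where `column` is None
    freshness_hours : hours since the newest `updated_at` timestamp
    """
    if not rows:
        raise ValueError("cannot compute metrics on an empty table")
    row_count = len(rows)
    distinct_keys = {r[key] for r in rows}
    nulls = sum(1 for r in rows if r[column] is None)
    newest = max(r[updated_at] for r in rows)
    return {
        "row_count": row_count,
        "uniqueness": len(distinct_keys) / row_count,
        "nullness": nulls / row_count,
        "freshness_hours": (now - newest).total_seconds() / 3600,
    }


def ts(day, hour):
    """Helper: a UTC timestamp in May 2024 (sample data only)."""
    return datetime(2024, 5, day, hour, 0, tzinfo=timezone.utc)


# Hypothetical table: one duplicated order_id and one missing email.
rows = [
    {"order_id": 1, "email": "a@example.com", "updated_at": ts(17, 22)},
    {"order_id": 2, "email": None, "updated_at": ts(18, 6)},
    {"order_id": 2, "email": "c@example.com", "updated_at": ts(18, 9)},
    {"order_id": 4, "email": "d@example.com", "updated_at": ts(18, 3)},
]

metrics = quality_metrics(rows, "order_id", "email", "updated_at", now=ts(18, 12))
print(metrics)
```

In a real deployment these values would typically be computed with SQL against the warehouse and tracked over time, so that a monitor can alert when one drifts outside its expected range.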

With Snowflake’s investment, Metaplane will deepen its integration with the Snowflake data cloud. This enhanced integration will cover more telemetry and metadata on the platform, including entire data pipelines and capabilities such as Snowpark, Snowpark Container Services, Snowflake Native Apps, and Streamlit. This will enable Snowflake customers to closely monitor the quality of their data as it progresses through various pipeline stages, powering downstream applications. In the event of any issues, Metaplane will notify users about the problem, its root cause, and the most appropriate resolution.

Although the exact timeline for this deeper integration is yet to be confirmed, Snowflake has stated that Metaplane will also release a native app on the Snowflake data cloud. This will allow users to deploy and manage Metaplane directly within their Snowflake instance, eliminating the need to connect Snowflake separately, as they must with other data tools.

Since Sridhar Ramaswamy became CEO, Snowflake has aggressively embraced AI to better compete with companies like Databricks, which has focused on AI from its early stages. At last year’s Snowday event, Snowflake launched Cortex, a fully managed service for building generative AI applications using data stored in the cloud. Over the following months, Snowflake partnered with several AI model vendors, including Mistral and Reka, to offer their models on Cortex, facilitating the development of applications for various use cases. Additionally, Snowflake trained Arctic, its own large language model (LLM) optimized for complex enterprise workloads such as SQL generation, code generation, and instruction following. It also introduced a copilot experience to help users explore and understand their data.

Before investing in Metaplane, Snowflake backed four other companies to strengthen its data and AI capabilities: Coda, Coalesce, Observe, and Landing AI.

Key Investments by Snowflake in 2024

Company    | Focus Area                                 | Month of Investment
Coda       | Collaborative Document Platform            | January
Coalesce   | Data Transformation and Modeling           | February
Observe    | Telemetry Analysis and Incident Resolution | March
Landing AI | AI Solutions for Industrial Applications   | April
Metaplane  | Data Observability and Quality Management  | May

  • Enhanced Data Quality: Improved ability to monitor and maintain data quality across complex pipelines.
  • Seamless Integration: Native app for Snowflake, allowing direct deployment and management within Snowflake instances.
  • Advanced AI Capabilities: Use of machine learning to detect data anomalies and provide actionable insights.
  • Comprehensive Coverage: Extended telemetry and metadata analysis covering entire data pipelines and Snowflake capabilities.
  • Efficient Incident Resolution: Real-time alerts and detailed root cause analysis for data issues.

Ashwin Kamath and Harsha Kapre, who manage product development at Snowflake, highlighted the benefits of this integration in a joint blog post. They noted that this collaboration would open the door to richer experiences for customers, enabling them to fully leverage Metaplane’s capabilities without moving or copying their data outside the secure, governed environment of their Snowflake accounts.

Snowflake’s investment in Metaplane represents a strategic move to bolster its data quality management capabilities through AI-powered solutions. As enterprises increasingly rely on data to drive their applications, ensuring the accuracy and reliability of that data is paramount. With this enhanced integration, Snowflake aims to provide its customers with the tools they need to maintain high data quality standards, ultimately supporting more robust and reliable downstream applications. This partnership underscores Snowflake’s commitment to advancing its AI capabilities and delivering innovative solutions to meet the evolving needs of its customers.



