Here’s what you need to know about synthetic data AI startup Gretel, which Nvidia reportedly just acquired for over $320 million.
Nvidia has reportedly acquired synthetic data AI startup Gretel to help accelerate Nvidia’s suite of generative AI services for developers.
Gretel has formed big partnerships with the likes of Amazon Web Services, Google Cloud and Microsoft, while at the same time raising approximately $65 million over the past five years.
Nvidia completed the acquisition of San Diego, Calif.-based Gretel on March 19, according to a report by Wired, with the deal valued at over $320 million. The exact financial terms of the deal were unknown.
The synthetic data platform startup has around 80 employees and was founded by a group of engineers and developers with backgrounds working for Google, Red Hat, AWS and the U.S. National Security Agency (NSA).
Gretel has a booth at Nvidia’s GTC 2025 conference this week in San Jose, Calif.
Nvidia and Gretel did not respond to requests for comment by press time.
Gretel’s platform leverages advanced generative models to create artificial data that retains the statistical properties of real-world datasets while ensuring data privacy. The platform supports various data types, such as structured tabular data, time-series data and unstructured text, which allows customers to share data, analyze it and develop AI models without exposing sensitive data or proprietary IP, according to the company.
The startup will reportedly be folded into Nvidia.
Gretel’s Cloud Partnerships With AWS, Google And Microsoft
Gretel has partnerships with major cloud players such as AWS, Google Cloud and Microsoft to help joint customers generate high-quality, safe synthetic data for enterprise AI solutions.
In October 2024, Google Cloud and Gretel teamed up to simplify and streamline synthetic data generation for data engineers and data scientists within BigQuery.
In late 2023, Gretel unveiled a partnership with Microsoft Azure and joined the Microsoft for Startups Pegasus Program.
Also in late 2023, Gretel signed a strategic collaboration agreement with AWS aimed at accelerating responsible generative AI development that protects sensitive and personal data.
Nvidia, for its part, can leverage Gretel’s technology to offer developers tools for generating realistic, privacy-preserving datasets, along with the ability to train and fine-tune AI models across various applications.
The acquisition also complements Nvidia’s existing technologies for generating synthetic data to train large language models (LLMs).
Nvidia has offered synthetic data tools to developers for years. In 2024, Nvidia launched its Nemotron family of open AI models that generate synthetic training data for developers to use in building or fine-tuning LLMs.
What Is Gretel?
Gretel said its APIs, designed by developers for developers, make it easy to generate anonymized and safe synthetic data so users can preserve privacy and innovate faster.
Developers use Gretel to create artificial, privacy-enhanced versions of their sensitive data and to quickly generate new labeled samples to augment limited machine learning training datasets, all on demand.
The company says over 150,000 developers are using Gretel.
Gretel’s application programming interface (API) integrates into existing workflows, letting developers generate and customize synthetic datasets on demand. Users can also fine-tune the balance between data fidelity and anonymization, guided by built-in privacy controls, according to the company.
Synthetic Data In The AI Era
Some of the world’s largest AI providers, including AWS, Microsoft, Meta and Anthropic, are already leveraging synthetic data to train their AI models as sources of real-world data become scarce.
Meta’s Llama 3 LLM uses synthetic data, while Amazon’s Bedrock AI platform lets developers use Anthropic’s Claude to generate synthetic data.
Synthetic data includes text, videos and images that can be combined with real-world data to train AI models more efficiently and affordably.