Google has disclosed that “commercially motivated” actors attempted to clone its Gemini AI simply by prompting the model and collecting its responses across many languages to train a cheaper copycat. The company frames the activity as model extraction and warns that it amounts to intellectual property theft.
The practice, known in the industry as distillation, trains a new model on Gemini’s outputs rather than its underlying data or code, letting copycats mimic its behavior at a fraction of the development cost.
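To make the mechanism concrete, here is a minimal, hypothetical sketch of sequence-level distillation in PyTorch: a small student model is trained on text produced by a teacher, without ever touching the teacher’s weights or training data. The query_teacher function and the toy student architecture are illustrative placeholders, not a description of Gemini, Google’s pipeline, or any real extraction attempt.

```python
# Sketch of sequence-level distillation under stated assumptions:
# the student learns to imitate a teacher's *outputs*, not its internals.
import torch
import torch.nn as nn

def query_teacher(prompt: str) -> str:
    # Placeholder for a call to the teacher model (e.g. a hosted chat API).
    # A real extraction attempt would issue many prompts across languages.
    return "canned teacher answer to: " + prompt

# 1. Collect (prompt, teacher-response) pairs.
prompts = ["Explain photosynthesis.", "Translate 'hello' to French."]
pairs = [(p, query_teacher(p)) for p in prompts]

# 2. Character-level tokenization keeps the sketch self-contained.
text = "".join(p + r for p, r in pairs)
vocab = sorted(set(text))
stoi = {c: i for i, c in enumerate(vocab)}
ids = torch.tensor([stoi[c] for c in text])

# 3. A deliberately tiny "student" language model.
class Student(nn.Module):
    def __init__(self, vocab_size: int, dim: int = 32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)

student = Student(len(vocab))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# 4. Standard next-token prediction on the teacher's outputs: the student
#    is pushed to reproduce the teacher's behavior at low cost.
x, y = ids[:-1].unsqueeze(0), ids[1:].unsqueeze(0)
for step in range(200):
    logits = student(x)
    loss = loss_fn(logits.reshape(-1, len(vocab)), y.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The same idea scales up in the scenarios the article describes: more prompts, a production API as the teacher, and a transformer as the student, which is why copying a model’s behavior can cost a fraction of developing it.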
The attacks reportedly originated from around the world, with Google saying private firms and researchers accounted for most attempts. The company declined to name suspects but flagged that this form of copying raises intellectual property concerns.
Google’s account references earlier episodes, including a 2023 report that Bard may have been trained in part on outputs from OpenAI’s ChatGPT obtained via ShareGPT. Senior Google researcher Jacob Devlin reportedly warned that this violated OpenAI’s terms before resigning to join OpenAI; Google denied the claim but is said to have stopped using the data.
Distillation remains a widely used technique for building smaller LLMs, both inside and outside the major labs. Google says the line between legitimate distillation and improper copying hinges on whether the copier has permission to use the source model and its data, which complicates enforcement in court.