DeepSeek vs OpenAI: The Legal Battle Over AI Training & Copyright

DeepSeek vs. OpenAI has rapidly become one of the most closely watched rivalries in the tech world. DeepSeek’s meteoric rise has shaken the global AI sector and the stock markets alike. Nvidia alone lost nearly USD 600 billion in market capitalization in a single day, while other major tech firms also faced heavy losses. DeepSeek has quickly captured global attention, becoming the most downloaded app on both the Apple App Store and Google Play Store.

OpenAI, the creator of ChatGPT, has accused DeepSeek of inappropriately distilling and leveraging OpenAI’s proprietary technology to train its AI models, allegedly breaching its terms of use. Yet OpenAI itself is no stranger to controversy—it faces multiple lawsuits worldwide over alleged copyright infringement in its training datasets. The DeepSeek vs. OpenAI battle is therefore not just about market dominance, but also about intellectual property rights and the ethical boundaries of AI development.

Two interesting issues arise from this:

1. Is it copyright infringement to use copyrighted material to train AI models?
2. Is it copyright infringement if an AI model uses output from another AI model for training purposes?

The answer to the first issue is not clear-cut and differs across jurisdictions. In general, using copyrighted material without the consent of the rights holders would be infringing, subject to possible exceptions and defences. In Singapore, for example, section 244 of the Copyright Act 2021 permits a person to make copies of copyright material for the purpose of computational data analysis (which would include AI training). If the conditions in section 244 are satisfied, there would be no copyright infringement. Another exception which could apply is the fair use exception under section 191 of the Copyright Act 2021. Much would depend on exactly how the copyrighted material has been used.

The position may differ significantly in other jurisdictions. For example, some jurisdictions may not have a computational data analysis exception, or may have one that is much narrower in scope than Singapore's, and others may apply different criteria and considerations in assessing fair use.

A further difficulty with the first issue is a practical one: given the black-box nature of AI large language models (LLMs), it may be hard to establish whether, and how, copyrighted material has been copied.

On the second issue, there is a serious question as to whether AI-generated output can be copyrighted at all. The answer is quite probably “no”. In most jurisdictions (including Singapore), copyright only subsists where there is a human author. But where AI-generated works are concerned, is there a human author at all? It would seem odd for authorship of such works to be attributed to the engineers behind the AI model. And it would be similarly (perhaps even more) odd for the authorship to reside with the person inputting prompts into the AI model to get output. And what if the “person” who is inputting the prompts is itself an AI model? There are no good answers to these questions, and the situation will likely only become more settled if or when legislatures around the world update their laws to squarely address the advent of generative AI.

Until then, the best protection for operators of AI models against unintended use of their models (e.g. distillation) would be through carefully crafted terms of use. For example, ChatGPT’s terms of use provide that users may not “Automatically or programmatically extract data or Output”. It remains to be seen if such terms have provided OpenAI with sufficient protection against the likes of DeepSeek.

For more information, please contact Basil Lee at basil.lee@helmsmanlaw.com.

Disclaimer

This publication is provided for general information purposes only and does not constitute legal or professional advice. It does not purport to be comprehensive or address every aspect of the matters discussed. While we strive to ensure the accuracy of the information at the time of publication, we make no representations or warranties as to its accuracy, completeness, or suitability for any particular purpose. You should seek specific legal or professional advice before taking any action based on the contents of this publication. We do not accept any liability for any loss or damage arising from any reliance placed on this publication or its contents. No lawyer-client relationship is created by this publication.
