An AI Data Privacy Cautionary Tale: Court-Ordered Data Retention Meets Privacy
“Data systems are designed to remember data, not to forget data.”
Debbie Reynolds, "The Data Diva"
The rapid rise and widening adoption of generative artificial intelligence have brought both breakthrough innovation and mounting data risk. I have spent years telling individuals and organizations, in my keynotes and writings, not to submit personal, sensitive, confidential, or proprietary data to generative AI systems. Why? Once data enters these systems, organizations and users often lose control over what happens to it. As I have said time and again, “Data systems are designed to remember data, not to forget data.” That reality has now been confirmed by a federal court order in the case of New York Times v. OpenAI.
In a development with far-reaching implications, the United States District Court for the Southern District of New York recently ordered OpenAI, the company behind ChatGPT, to preserve all user interactions across its AI products. This includes chat logs, input prompts, and generated outputs from both individual users and businesses accessing the system via API. The order also suspends all deletion mechanisms previously in place, including those that allowed users to request the removal of their data. OpenAI has publicly acknowledged the implications, calling the order a “privacy nightmare” (Ars Technica article). The full preservation order, dated May 13, 2025, is publicly available (court order PDF).
This court decision serves as a cautionary tale, illustrating what can happen when privacy, litigation, and artificial intelligence systems intersect. It confirms that the protections many users believe they have when interacting with AI platforms can evaporate the moment those platforms become entangled in legal proceedings.
When “Delete” Does Not Mean “Delete”
Many users assume that deleting data from a platform actually removes it. In reality, deletion often removes only the visible, user-accessible layer, while other copies, such as system backups, logs, caches, and training artifacts, remain intact. In the context of generative AI systems, this becomes even more complicated. These models may absorb patterns from user inputs and retain them in ways that cannot be undone, even if the user-facing chat log is cleared.
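To make the mechanics concrete, here is a minimal sketch of the “soft delete” pattern common in data systems. The class and field names are hypothetical, not drawn from any real platform’s code: the user-facing view empties on deletion, while a copy written to an audit log at save time quietly persists.

```python
# A minimal, illustrative sketch of "soft deletion" -- names are hypothetical.
import datetime


class ChatStore:
    """Toy store showing why a user-facing 'delete' may not erase data."""

    def __init__(self):
        self.records = {}    # the layer users see and can "delete"
        self.audit_log = []  # backups/logs often keep a second copy

    def save(self, record_id, prompt, response):
        entry = {"prompt": prompt, "response": response, "deleted_at": None}
        self.records[record_id] = entry
        # Logging pipelines frequently capture a copy at write time.
        self.audit_log.append((record_id, dict(entry)))

    def delete(self, record_id):
        # "Delete" only hides the record from the user-facing view;
        # the audit-log copy is untouched.
        self.records[record_id]["deleted_at"] = datetime.datetime.now(
            datetime.timezone.utc
        )

    def visible_to_user(self):
        return {k: v for k, v in self.records.items() if v["deleted_at"] is None}


store = ChatStore()
store.save("c1", "Here is our unreleased product roadmap ...", "Summary ...")
store.delete("c1")
print(store.visible_to_user())  # {}  -- looks gone to the user
print(len(store.audit_log))     # 1   -- a copy persists elsewhere
```

Real platforms layer backups, replicas, and training pipelines on top of this; each one is another place a “deleted” record can survive.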
Now, with the court’s order in effect, OpenAI is legally required to suspend deletion capabilities altogether. All user data, past and future, is to be preserved, including information that users may have intended to be temporary or confidential. The decision affects every user of ChatGPT and API-based integrations, regardless of location or purpose.
This development is a stark reminder that the legal system can override user expectations of privacy. It also demonstrates that generative AI platforms retain data in ways that are inherently more persistent than conventional data systems. Once information enters these systems, users may not be able to retrieve, control, or erase it in any meaningful sense.
API Business Users Are Not Exempt
Many businesses mistakenly believe that using the API version of ChatGPT (or other AI tools) keeps their data separate from the general user data pool and better protected. Some assume an enterprise integration insulates them from broader data retention or usage. This court order proves that belief incorrect.
The data preservation order applies to all ChatGPT interactions, including prompts submitted through API calls and the outputs those calls return. Any business using ChatGPT for internal tools, customer interactions, or employee-facing applications must now assume that every prompt and response is being logged, stored, and held under court authority.
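As an illustration, here is a minimal sketch of a typical backend integration using the OpenAI Python client; the model name and prompt are placeholder assumptions, not a statement about any particular deployment. The point is simply that the entire prompt payload and the entire completion pass through, and can be retained on, the provider’s infrastructure.

```python
# A minimal sketch of a backend API integration (pip install openai).
# Model name and prompt are illustrative placeholders only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Everything in `messages` leaves your infrastructure. Under a
# preservation order, assume this payload and the completion below
# are both logged and retained on the provider's side.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize our internal Q3 memo: ..."}],
)

print(response.choices[0].message.content)
```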
Even if your company has internal policies forbidding the sharing of proprietary or sensitive data, any accidental or uninformed submission is now preserved indefinitely. This includes trade secrets, legal inquiries, financial data, health-related information, and other protected categories. These submissions are no longer just internal issues. They may become part of a litigation archive beyond your control.
This presents serious challenges for legal compliance, data governance, and risk management. Companies must now proactively manage the use of AI across their workforce and supply chain. They should also review vendor contracts, privacy policies, and technical safeguards with renewed attention and diligence.
When Privacy Meets Litigation
Data privacy laws, such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), grant users significant rights. These include the right to access, correct, and delete their data. But once data becomes relevant to litigation, courts can impose preservation obligations that override those privacy rights. This has always been the case; AI simply raises the stakes, because companies now share far more of their data with these systems.
That is what has occurred here. The court has directed OpenAI to preserve all data that might be relevant to the case brought by The New York Times. In doing so, the order effectively renders OpenAI unable to comply with user deletion requests. OpenAI itself has voiced concern that this raises the likelihood of privacy violations and increases the risk of data breaches.
This scenario presents a troubling conflict between the principles of legal due process and data protection. When data collected by AI tools is drawn into litigation, users lose the ability to manage, control, or know where their information resides. This is especially concerning in cases where individuals are not parties to the lawsuit but are nonetheless swept up in the preservation mandate.
Businesses Must Reassess Their Risk
Companies that use generative AI tools, especially through API access, are now on notice. This court order shows that simply using an API does not shield an organization’s data from external retention or legal exposure. Any prompt submitted, even through automated or backend systems, can be preserved under court order without the company’s awareness or consent.
The consequences are severe. A business might find itself in violation of internal privacy policies, client contracts, or regulatory obligations simply because it submitted data into a third-party AI tool that is now frozen under litigation. Even organizations that believed they were minimizing risk could now face unintended legal or reputational fallout.
To respond, businesses must:
Train employees and contractors on what data should never be submitted to AI tools
Establish pre-approval workflows for AI use in sensitive areas (a minimal screening sketch follows this list)
Audit existing interactions with AI vendors to identify potential risks
Update contracts with clear terms about data use, storage, and deletion
Monitor regulatory guidance on AI data retention and cross-border data flows
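One practical way to operationalize the pre-approval item above is a screening gate that inspects every AI-bound prompt before it leaves the organization. The sketch below is a starting point under stated assumptions: the helper function, patterns, and keywords are illustrative and deliberately incomplete, not a full data-loss-prevention solution.

```python
# A minimal sketch of a pre-submission screening gate. All names and
# patterns here are illustrative assumptions, not a complete DLP tool.
import re

BLOCKED_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

BLOCKED_KEYWORDS = ("confidential", "trade secret", "attorney-client")


def screen_prompt(prompt: str) -> list[str]:
    """Return the reasons a prompt should not be sent, if any."""
    findings = []
    for label, pattern in BLOCKED_PATTERNS.items():
        if pattern.search(prompt):
            findings.append(f"possible {label} detected")
    lowered = prompt.lower()
    for keyword in BLOCKED_KEYWORDS:
        if keyword in lowered:
            findings.append(f"blocked keyword: {keyword!r}")
    return findings


def submit_to_ai(prompt: str) -> str:
    findings = screen_prompt(prompt)
    if findings:
        # Route to human pre-approval instead of the AI vendor.
        raise PermissionError(f"Prompt held for review: {findings}")
    # ... forward the vetted prompt to the approved AI endpoint here ...
    return "submitted"


print(submit_to_ai("Draft a polite out-of-office reply."))
# submit_to_ai("Client SSN 123-45-6789 needs ...")  # -> PermissionError
```

In practice, a gate like this would live in a shared client library or proxy so individual teams cannot bypass it, which is exactly the kind of technical enforcement discussed below.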
Governance must transition from theory to practice. That includes building safeguards into development environments, sales tools, HR systems, and customer engagement platforms that rely on AI.
No One Is Immune
The preservation order in New York Times v. OpenAI sends a powerful message. Whether you are a casual user, a developer, or a multinational business, your data is not beyond reach. Once entered into a generative AI system, your input can become part of a legal process that is completely outside your control.
Courts will not stop to ask whether you expected your data to be deleted. They will only ask whether the data might be relevant to a case. If the answer is yes, the platform is obligated to retain it, even if that creates direct conflicts with privacy regulations or customer promises.
This moment highlights the fundamental problem with relying on deletion as a means of safeguarding privacy. I have long said that AI systems are designed to remember, not to forget. That is not a flaw in the code; it is a core function of how these systems operate. And now, courts are enforcing that persistence in ways that users never anticipated.
Prioritizing Internal Controls and Data Discipline
Rather than waiting for regulators or courts to define boundaries, companies should adopt disciplined internal data practices now. This includes curating what information is allowed into AI systems in the first place. Businesses must treat every prompt or data entry as a potential record that could be retained, exposed, or discoverable.
Good data hygiene starts with defining clear policies. Who is authorized to use generative AI? What types of data are permitted? What business functions are off-limits? These questions should not be left to informal guidance or after-the-fact corrections. They should be codified into enforceable internal policies, with training, technical enforcement, and routine audits.
Organizations should also require AI vendors to offer usage transparency. What is being stored? For how long? Is it retrievable or exportable? Can organizations review and purge data associated with their account?
Just as businesses learned over time to restrict sensitive data from being emailed, stored in the cloud, or shared across insecure channels, they now need to build new muscle memory for AI usage. This is not a hypothetical compliance exercise. It is a material matter of protecting intellectual property, trade secrets, and customer trust, and of limiting legal exposure.
Moving Forward
For users, the lesson is clear. Never assume your interaction with an AI system is private or temporary. Do not share personal, medical, financial, or business-critical information unless you are prepared for it to be stored indefinitely. For organizations, the stakes are even higher. You must now govern AI use with the same rigor you apply to legal, financial, and compliance systems.
The court’s preservation order is a cautionary tale. It confirms what I have warned about for years on stages around the world. The permanence of AI data is not just a technical design issue; it is also a legal and operational risk. If we fail to act by implementing strong internal controls, educating users, and being intentional about how we engage with AI, we may find ourselves exposed in ways that cannot be reversed. Taking the right next steps will help your organization make data privacy a business advantage.