
I think the court overstepped by ordering OpenAI to save all user chats. Private conversations with AI should be protected - people have a reasonable expectation that deleted chats stay deleted, and knowing everything is preserved will chill free expression. Congress needs to write clear rules about what companies can and can't do with our data when we use AI. But honestly, I don't have much faith that Congress can get their act together to pass anything useful, even when it's obvious and most people would support it.


Why is AI special in this regard? Why is my exchange with ChatGPT any more privileged than my DuckDuckGo search for _HIV test margin of error_?


You're right, it's not special.

This is from DuckDuckGo's privacy policy: "We don’t track you. That’s our Privacy Policy in a nutshell. We don’t save or share your search or browsing history when you search on DuckDuckGo or use our apps and extensions."

If the court compelled DuckDuckGo to log all searches, I would be equally concerned.


That's a pretty significant difference, though.

OpenAI (and other services) log and preserve your interactions, in order to either improve their service or to provide features to you (e.g., your chat history, personalized answers, etc., from OpenAI). If a court says "preserve all your user interaction logs," they exist and need to be preserved.

DDG explicitly does not track you or retain any data about your usage. If a court says "preserve all your user interaction logs," there is nothing to be preserved.

It is a very different thing - and a much higher bar - for a court to say "write code to begin logging user interaction data and then preserve those logs."


OpenAI also claims to delete logs 30 days after you've deleted the chats. Anything that you've deleted but that OpenAI hasn't purged yet will now be open to inspection by the court.


I should have said "web search", as that's really what I meant -- DDG was just a convenient counterexample.


DuckDuckGo uses Bing.

It would be interesting to know how much Microsoft logs or tracks.


AI is not special, and that's exactly the issue. The court set a precedent here. If OpenAI can be ordered to preserve all the logs, then DuckDuckGo can face the same order even if they don't want to keep logs.


People upload about 100x more information about themselves to ChatGPT than search engines.


How did the court overstep? Orders to preserve evidence are routine in civil cases. Customer expectations about privacy have zero legal relevance.


Sure, preservation orders are routine - but this would be like ordering phone companies to record ALL calls just in case some might become evidence later. There's a huge difference between preserving specific communications in a targeted case and mass surveillance of every private conversation. The government shouldn't have that kind of blanket power over private communications.


> but this would be like ordering phone companies to record ALL calls just in case some might become evidence later

That's not a good analogy. They're ordered to preserve records they would otherwise delete, not create records they wouldn't otherwise have.


They are requiring OpenAI to log API calls that would otherwise not be logged. I trust when OpenAI says they will not log or train on my sensitive business API calls. I trust them less to guard and protect logs of those API calls.


Change calls to text messages. The important thing is the keeping of records unrelated to an open case, which affects millions of people's privacy.


I mean, to be fair, it is related to a current open case, but the order is pretty ridiculous on its face. It feels different when the company and its employees have to retain their own comms and documents; requiring the company to do the same for third parties who are related but not actually involved in the lawsuit is a bit of a stretch.

Why the NYT cares about a random ChatGPT user bypassing their paywall when an archive.ph link is posted on every thread is beyond me.


> I mean to be fair


No, it's pretty good. To refine it further, it's why you put a single user under scrutiny on litigation hold rather than the whole exchange server.


No, it wouldn't be like that at all. Phone companies and telephone calls are covered under a different legal regime so your analogy is invalid.


Consider the opposite prevailing, where I can legally protect my warez site simply by saying "sorry, the conversation where I sent them a copy of a Disney movie was private".


The legal situation you describe is a matter of impossibility and unrelated to the OpenAI case.

In the case of a warez site, they would never have logged such a "conversation" to begin with. So if the court requested that they produce all such communications, the warez site would simply declare "Impossibility of Performance".

In the case of OpenAI the courts are demanding that they preserve all future communications from all their end users—regardless of whether or not those end users are parties (or even relevant) to the case. The court is literally demanding that they re-engineer their product to record all communications where none existed previously.

I'm not a lawyer but that seems like it would violate FRCP 26(b)(1) which covers "proportionality". Meaning: The effort required to record the evidence is not proportional relative to the value of the information sought.

Also—generally speaking—courts recognize that a party is not required to create new documents or re-engineer systems to satisfy a discovery request. Yet that is exactly what the court has requested of OpenAI.


If specific users are violating the law, then a court can and should order their data to be retained.


The preservation order feels like a blunt instrument in a situation that needs surgical precision


Would it be possible to comply with the order by anonymizing the data?

The court is after evidence that users use ChatGPT to bypass paywalls. Anonymizing the data in a way that makes it impossible to 1) pinpoint the users and 2) reconstruct the generic user conversation history would preserve privacy and allow OpenAI to comply in good faith with the order.

The fact that they are blaring sirens and hiding behind "we can't, think about users' privacy" feels akin to willful negligence, or suggests that they know they have something to hide.


> feels akin to willful negligence or that they know they have something to hide

Not at all; there is a presumption of innocence. Unless a given user is plausibly believed to be violating the law, there is no reason to search their data.


Anonymizing data is really hard, and I'm not sure they'd be allowed to do it. I mean, they're accused of deleting evidence; why would they be allowed to alter it?


If it's possible evidence as part of a lawsuit, of course they can't delete it.


A targeted order is one thing, but this applies to ALL data. My data is not possible evidence as part of a lawsuit, unless you know something I don't know.


That’s… not how discovery works


The government's power to compel private companies to preserve citizens' communications needs clear limits. When the law is ambiguous about these boundaries, courts end up making policy decisions that should come from Congress. We need legislative clarity that defines exactly when and how government can access private digital communications, not case-by-case judicial expansion of government power.


My point is lawsuits make your data part of discovery retroactively. You aren’t being sued right now, but perhaps you will be.


Their point is that the discovery is asking for data of unrelated users. Necessarily so unless the claim is that all users who delete their chats are infringing.


Your point illustrates exactly why the tension between due process and privacy rights can't be fairly resolved by courts alone, since they have an inherent bias toward preserving their own discovery powers.



