"We kill people based on metadata." - National Security Agency Gen. Michael Hayden
Raw data with time-series significance is their absolute favorite. You might argue something like Google Maps data is "obfuscated by virtue of its banality" until you catch the right person in the wrong place. ChatGPT sessions are the same way, and it's going to be fed into aggregate surveillance systems in the way modern telecom and advertiser data is.
This is mostly security theater, and generally not worth the lift when you consider the steps needed to unlock the value of that data in the context of investigations.
-The Privacy and Civil Liberties Oversight Board’s 2014 review of the NSA “Section 215” phone-record program found no instance in which the dragnet produced a counter-terror lead that couldn’t have been obtained with targeted subpoenas.
https://en.m.wikipedia.org/wiki/Privacy_and_Civil_Liberties_...
-After Boston, Paris, Manchester, and other attacks, post-mortems showed the perpetrators were already in government databases. Analysts simply didn’t connect the dots amid the flood of benign hits.
https://www.newyorker.com/magazine/2015/01/26/whole-haystack
-Independent tallies suggest dozens of civilians killed for every intended high-value target in Yemen and Pakistan, largely because metadata mis-identifies phones that change pockets.
https://committees.parliament.uk/writtenevidence/36962/pdf
Raw data with time-series significance is their absolute favorite. You might argue something like Google Maps data is "obfuscated by virtue of its banality" until you catch the right person in the wrong place. ChatGPT sessions are the same way, and it's going to be fed into aggregate surveillance systems in the way modern telecom and advertiser data is.