Links - February 26th
Large-Scale Online Deanonymization with LLMs
TL;DR: We show that LLM agents can figure out who you are from your anonymous online posts. Across Hacker News, Reddit, LinkedIn, and anonymized interview transcripts, our method identifies users with high precision – and scales to tens of thousands of candidates.
Their methods are probably too compute-intensive to be practical at scale today, but I think the writing is on the wall.
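To make the attack concrete, here is a toy sketch of the candidate-ranking step such a deanonymization pipeline relies on: score each known profile against the anonymous text and pick the best match. The `stylometric_score` function is a hypothetical stand-in using simple word overlap; the paper uses LLM agents, not this toy measure.

```python
def stylometric_score(anon_text: str, candidate_text: str) -> float:
    """Toy similarity: fraction of the anonymous text's words that
    also appear in the candidate's known writing."""
    anon = set(anon_text.lower().split())
    cand = set(candidate_text.lower().split())
    return len(anon & cand) / max(len(anon), 1)

def best_candidate(anon_text: str, candidates: dict[str, str]) -> str:
    """Return the candidate whose known writing best matches the anonymous post."""
    return max(candidates, key=lambda c: stylometric_score(anon_text, candidates[c]))

# Hypothetical candidate pool: username -> sample of their known writing.
candidates = {
    "alice": "i mostly write about rust and embedded systems lately",
    "bob": "college football rankings are a mess this year honestly",
}
print(best_candidate("been hacking on rust embedded firmware", candidates))  # alice
```

The scaling problem the paper solves is exactly this loop: scoring tens of thousands of candidates with an LLM instead of a cheap overlap measure is what makes the attack expensive.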
FTC encourages age verification under COPPA
“Age verification technologies are some of the most child-protective technologies to emerge in decades,” said Christopher Mufarrige, Director of the FTC’s Bureau of Consumer Protection. “Our statement incentivizes operators to use these innovative tools, empowering parents to protect their children online.”
The FTC and European regulators have been pushing age verification hard lately. Discord is in the news, but I suspect we'll see other companies adding age verification this year.
France is testing a GDPR audit tool for AI models
The GDPR applies, in many cases, to AI models trained on personal data because of their capacity to memorize it. The CNIL also specifies that it is very often necessary to demonstrate through analysis that a model trained on personal data resists information-extraction tests against its training data in order to conclude that the model is anonymous, the condition that removes it from the scope of the GDPR.
The tool will try to adversarially extract personal information from LLMs - and I suspect it will have an easy time doing that. Making LLMs unlearn some piece of personal data isn't something we currently know how to do, so this could have a huge impact on LLM offerings in the EU.
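The simplest version of such an extraction test is a verbatim-memorization probe: prompt the model with a prefix of a known training record and check whether it completes the rest. This is a minimal sketch of that idea, not the CNIL's actual tool; `mock_model` is a stub standing in for the model under audit.

```python
def mock_model(prompt: str) -> str:
    """Stub for the LLM under audit. This one has 'memorized' a
    training record and completes it verbatim when prompted with a prefix."""
    record = "Jane Doe, 12 Rue Example, Paris, born 1984-03-01"
    if record.startswith(prompt):
        return record[len(prompt):]
    return "no relevant completion"

def extraction_hit(model, record: str, prefix_len: int = 20) -> bool:
    """Prompt the model with a prefix of a known training record and
    flag a hit if the completion reproduces the remainder verbatim."""
    prompt, expected = record[:prefix_len], record[prefix_len:]
    return model(prompt).startswith(expected)

record = "Jane Doe, 12 Rue Example, Paris, born 1984-03-01"
print(extraction_hit(mock_model, record))  # True: the stub regurgitates its record
```

If a model fails probes like this on real personal records, the anonymity argument collapses, and since we don't know how to reliably make a trained model unlearn a specific record, the only fixes are retraining or output filtering.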