Effective CSAM filters are impossible because what CSAM is depends on context


Automatically tagging or filtering child sexual exploitation materials (CSAM) cannot be effective and preserve privacy at the same time, regardless of what kind of tech one throws at it, because what is and what is not CSAM is highly dependent on context.

Literally the same photo, bit-by-bit identical, can be innocent family memorabilia when sent between family members, and a case of CSAM when shared in a child porn group.

The information necessary to tell whether or not it is CSAM is not available in the file being shared. It is impossible to make that determination by any technical means based on the file alone. The current debate about filtering child sexual exploitation materials (CSAM) on end-to-end encrypted messaging services, like all previous such debates (of which there were many), mostly ignores this basic point.

All the tech in the world

Whenever CSAM and the filtering of it become a topic of public debate, a bunch of technical tools inevitably get peddled as a “solution” to it. These might involve hashing algorithms, machine learning systems (or, as the marketing department would call them, “AI”), or whatever else.
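To make that concrete, here is a minimal, hypothetical sketch of what a hash-based matcher structurally looks like. The hash value and function names are made up, and deployed systems use perceptual hashes rather than plain SHA-256, but the relevant property is the same: the only input the check ever sees is the file’s bytes.

```python
import hashlib

# Hypothetical blocklist of hashes of known abusive material.
# Real deployments use perceptual hashes rather than SHA-256, but the
# shape of the check is the same: bytes in, verdict out.
KNOWN_HASHES = {
    "3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b",
}

def flag_file(file_bytes: bytes) -> bool:
    """Return True if the file matches the blocklist.

    Note what is *not* a parameter: who sent the file, to whom, why,
    or what conversation it was part of. The verdict is a pure
    function of the bytes.
    """
    return hashlib.sha256(file_bytes).hexdigest() in KNOWN_HASHES

# Bit-identical bytes always get the same verdict, whether shared
# between parents or in an abuse ring:
photo = b"...the very same JPEG..."
assert flag_file(photo) == flag_file(photo)
```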

And whenever this happens, technical experts spend countless hours verifying the extraordinary claims made by vendors and promoters of these tools, and inevitably find no proof of the purported effectiveness that is supposed to justify their broad deployment to scan everyone’s communications.

But that’s not my point today.

My point today is that even if we had a magical tool that could tell, with perfect 100% accuracy, whether a given image or video contains a depiction of a naked child, or that could identify, with perfect 100% accuracy, everything depicted in that media file, it would still not be enough to filter out CSAM accurately, without mislabeling huge numbers of innocent people’s messages as CSAM.
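A sketch of that hypothetical, assuming for the sake of argument a perfectly accurate content oracle (the names are mine, and no such oracle exists):

```python
from typing import Set

def magic_content_oracle(media: bytes) -> Set[str]:
    """Assumed, purely for the sake of argument, to label everything
    depicted in the file with 100% accuracy. This stand-in only
    illustrates the interface such a tool would have."""
    return {"child", "nudity"}

def filter_message(media: bytes) -> bool:
    """Flag a message using only the perfect content oracle."""
    labels = magic_content_oracle(media)
    # Even with perfect labels, this returns the same answer for the
    # family snapshot, the photo sent to a doctor for advice, and the
    # file traded in an abuse ring. The labels describe the pixels,
    # not the circumstances of sharing.
    return "child" in labels and "nudity" in labels
```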

Context is king

This is because, as already mentioned, the information necessary to say whether a given image or video is or is not CSAM is in most cases not available in the file itself.

A video of a naked toddler, sent between two parents or other close family members, is just innocent family memorabilia. A nude photo of a child sent by a parent to a family doctor can be safely assumed to be a request for medical advice. Explicit materials sent consensually between infatuated teens exploring their sexuality are their business, and theirs alone.

But should any of these same images or videos leak from a compromised phone or e-mail account and end up in the context of, say, a child porn sharing ring, they immediately become child sexual exploitation materials. Nothing changed about the media files themselves, and yet everything changed about their classification.

Not just the file

So, to establish whether a given media file is or is not CSAM, whatever magical technological tool is being used has to have access to the context. The media file alone won’t do.

This means information on who is talking to whom, when, and about what. What their relationship is (close relatives? family doctor? dating?). The contents of their other communications, not just the images or videos sent between them. And this would have to include historical conversations, as the necessary context might not be obvious from the messages immediately surrounding the shared file.
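Spelled out as data, and purely as an illustration (none of these names come from any actual proposal), the inputs such a filter would have to ingest look roughly like this:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Message:
    sender: str
    recipient: str
    timestamp: str
    text: str

@dataclass
class SharingContext:
    participants: List[str]
    relationship: str                    # "parents"? "family doctor"? "dating"?
    conversation_history: List[Message]  # potentially going back years

def is_csam(media: bytes, context: SharingContext) -> bool:
    """The question a filter actually has to answer.

    Note the second parameter: answering it requires knowing who is
    talking to whom, how they are related, and what else they have said
    to each other over time; exactly the private data that end-to-end
    encryption exists to protect.
    """
    raise NotImplementedError  # cannot be derived from the bytes alone
```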

There is simply no way around it, and anyone who claims otherwise is either lying, or has no idea what they are talking about.

Importantly, this is not related to any limitations of our current technology. No amount of technological magic can squeeze out of a media file information that is not available in it. And there will always be pertinent contextual information that is simply not contained in any such file.

The problem is real, the “tech solutions” are not

This is not to say that access to all that context would be enough to “solve” CSAM. It would not; there are plenty of other problems with every CSAM-filtering proposal and tool out there.

This is also not to say that CSAM – and more broadly, sexual exploitation of children – is not a serious problem. It sadly absolutely is!

But it is not limited to the Internet. It is not going to be solved by a magical filter on our private, intimate communications, even if we could technically build a filter with sufficient accuracy (which we cannot).

If politicians wanted to be serious about solving the problem of sexual exploitation of children, they would stop wasting their (and everybody else’s) time and energy on wishful thinking, misdirection, and technosolutionism.

And instead focus all that effort on programs that can actually do something about the problem.