The Right to Be Forgotten?

The European Court of Justice ruled in May 2014 that Google must honor requests to remove links about individuals from its search results. Reaction ranged from "there is a right to be forgotten" to "search engines will be inundated with requests."

Lost in all of the discussion is the impossibility of hiding anything on the Internet. Anyone can run their own Web crawler. I maintain several Web sites and daily see crawlers from all over the world accessing the machines.

Just recently I was investigating how hard it would be to write my own special-purpose crawler and saw a link to a project whose authors claim their Web crawler is 600 lines of code (about two or three days of work). Apache Nutch (https://nutch.apache.org) is an open source Web crawler that is now 12 years old and is capable of large-scale data collection.

All the would-be search engine provider needs to do is rent some cloud computers. That could be a new service not yet in the sights of the European Court of Justice, or it could be one individual looking for information on another. It's easy to imagine a new open-source project that would mash together the Nutch crawler and a list of public information sources to provide the exact type of search that the European court wants to stop.

If you read the news carefully, you would see that the court is *not* asking that the information be removed from the Internet; that was deemed "too hard." And yet nothing short of that will succeed. The issue is that the information is published, not that Google (or anyone else) is finding it.

Nothing the court can say will prevent information like this from being read by an indexing service, or even an individual running his/her own targeted search. The indexing services cannot know in advance who would like to be "forgotten," and you can't prevent individuals from reading what is on the Internet, any more than you can prevent them from going to the courthouse to read printed records. It's easier than before, that is all.

In short, the court is commanding a flawed technical "solution" to a social problem, and this will fail. To hide the information, it must be removed. And that comes with its own problems - what should be removed, and who should decide?

The odd part here is that Google is not being asked to hide links to defamatory information (the usual reason for a takedown request) - it is being asked to hide the truth. Embarrassing, yes, but not defamatory.

The plaintiff claimed that the repayment of his debt was not given the same prominence as the notice of default. I see an opportunity for a technical solution - when a person's name is the query, group the results by each individual who has that name, rather than using whatever (seemingly) random ordering is used now. It should be possible to guess which results belong to the same person based on the date and location of the information.
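
To make the grouping idea concrete, here is a rough sketch in Python. It is only an illustration, not how any real search engine works: the `Result` record, the `group_by_individual` function, the ten-year `max_gap_days` window, and the sample data are all assumptions invented for this example. It guesses that two results for the same name refer to the same person when they share a location and fall close together in time.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Result:
    """One search hit for a name query (hypothetical structure)."""
    title: str
    location: str
    published: date


def group_by_individual(results, max_gap_days=3650):
    """Greedy guess at which results refer to the same person.

    Results are visited in date order. A result joins the first existing
    group whose most recent entry has the same location and was published
    no more than max_gap_days earlier; otherwise it starts a new group.
    """
    groups = []
    for r in sorted(results, key=lambda x: x.published):
        for g in groups:
            last = g[-1]
            same_place = r.location == last.location
            close_in_time = (r.published - last.published).days <= max_gap_days
            if same_place and close_in_time:
                g.append(r)
                break
        else:
            # No existing group matched, so treat this as a new individual.
            groups.append([r])
    return groups


# Made-up results for one name shared by two different people.
hits = [
    Result("Debt repaid in full", "Springfield", date(1998, 11, 2)),
    Result("Wins chess tournament", "Riverton", date(2012, 6, 15)),
    Result("Notice of default", "Springfield", date(1998, 1, 19)),
]

for number, group in enumerate(group_by_individual(hits), start=1):
    print(f"Person {number}:")
    for r in group:
        print(f"  {r.published} {r.location}: {r.title}")
```

With results grouped this way, the notice of default and the later repayment would appear side by side under one person, instead of being scattered among hits for everyone else with the same name. A real system would need far better signals than an exact location match, but the principle is the same.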

If the search engines do that, then maybe - just maybe - people will stop thinking that I shot John Lennon. Until then, I can only wish that Mark Chapman (no relation - really!) didn't use his middle name too...

Chapman Consulting

Software Development Done Right.