You are being googled!

“We are moving to a Google that knows more about you.” Google CEO Eric Schmidt, February 9, 2005.

A lot of frightening trends are unfolding on the Internet. I could talk about the European Directive on Data Retention, which forces phone operators and Internet service providers in EU member states to log, for 6 to 24 months, all data related to phone calls and to Internet activities such as writing emails and visiting web pages. Or I could talk about search engine censorship and the collaboration of Google, Microsoft and Yahoo with the Chinese authorities. But I would rather talk about a lesser-known subject: the excessive data collection of the Google Empire, which is a huge time bomb for the privacy of Internet users.

The Google Empire

Starting with a search engine that began as a student project back in 1996, Google now launches new products almost every month. Besides standard search, Google offers specialized searches for images, videos, books, papers (called Scholar), news, products (called Froogle), locations (Maps), blogs, social networks, newsgroups, etc. Other products of the Google Empire are Google Earth, Google Desktop Search, Google Toolbar, Google Talk, Blogger, Gmail and AdSense/AdWords.
Some of these services have already drawn criticism from civil liberties organizations and privacy activists. The fact that Gmail analyzes all your mail and shows you ads related to its content provoked harsh reactions. Google Desktop Search faced even bigger protests. The goal of this tool is to make searching your computer and your private network as easy as searching the Internet. We have already seen how people are sometimes shocked to discover that personal information about themselves is out on the web and easily accessible through search. The same issues apply in general to desktop search, but they shouldn’t become a privacy problem as long as ordinary security procedures are followed. Unfortunately, in the case of Google Desktop Search they often aren’t, and that led consumer watchdogs including the Electronic Frontier Foundation (EFF) to urge a boycott.
What is the criticism about? It starts with the fact that the index of your computer’s content is transferred to Google and stored on its servers. “Google encrypts the data but also holds the key. In response to a subpoena, whether from the government or a private party, Google would have to use its key to unencrypt the data and turn it over in plaintext”, explains Cindy Cohn of the EFF. You can restrict what is indexed to exclude sensitive data, but to do so you have to know where your data is stored and how the Windows file structure works. Unfortunately many people don’t, and as a result their tax records, emails, chat logs and visited web pages end up stored by Google. In addition, anyone with access to your computer can quickly discover whether it holds incriminating evidence.
Google is not only criticized for these services. With each new service it offers, the range of private information it collects about you also grows. All of this is tied together by a simple cookie.

Google cookies forever

Google stores a unique cookie for each browser on the user’s computer. The cookie is submitted every time you do a search, visit a site that uses AdSense or visit a Blogger page. All of this, and the following information, is stored in Google’s database. If you have a Gmail account, the content and addresses of the emails you send and receive are stored. If you have an AdSense account, your full name, address and bank account are stored, along with the IP address of everybody who visits your pages that carry AdSense ads. Your purchases in Froogle, your posts on Blogger and your newsgroup activity in Google Groups are stored as well. And if you use other utilities like Google Desktop or Google Toolbar, that information is transmitted to Google too.
To understand the danger of this data mining you need to know how HTTP cookies work. They are stored on the user’s computer and are read and updated each time you use one of the above services. They were designed for authentication and to improve usability by maintaining user-specific information, but they can also be used to track users. In Google’s case the cookie’s expiry date is set to 2038, and the cookie is updated every time it is accessed. That is how Google builds an extensive profile of its users. And even though users can delete the cookies on their own computer, they can’t access or delete the information about them that Google keeps.
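As a rough illustration of the cookie mechanism described here, the following Python sketch builds a long-lived identifying cookie and reads it back the way a server would on a later visit. The cookie name "ID", its value and the exact expiry timestamp are assumptions made for the example, not Google’s actual values; only the year 2038 comes from the text.

```python
from datetime import datetime, timezone
from http.cookies import SimpleCookie

# Hypothetical cookie name and ID value, chosen only for illustration.
COOKIE_NAME = "ID"
COOKIE_VALUE = "4f2a9c1e7b3d5a60"


def build_set_cookie_header(name, value, expires):
    """Return the Set-Cookie header value a server could send to the browser."""
    cookie = SimpleCookie()
    cookie[name] = value
    # A far-future expiry makes the browser keep the cookie for decades.
    cookie[name]["expires"] = expires.strftime("%a, %d %b %Y %H:%M:%S GMT")
    cookie[name]["path"] = "/"
    return cookie[name].OutputString()


def read_cookie_header(header, name):
    """Parse the Cookie header a browser sends back and return the stored value."""
    cookie = SimpleCookie()
    cookie.load(header)
    return cookie[name].value


# Assumed expiry date in 2038; the day and time are illustrative.
far_future = datetime(2038, 1, 17, 19, 14, 7, tzinfo=timezone.utc)
set_cookie = build_set_cookie_header(COOKIE_NAME, COOKIE_VALUE, far_future)
print(set_cookie)

# On every later request the browser returns the same ID to the server.
returned_id = read_cookie_header(f"{COOKIE_NAME}={COOKIE_VALUE}", COOKIE_NAME)
print(returned_id)
```

Because the browser sends the same value back with every request until the cookie expires or is deleted, the server can recognize a returning browser without any login.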

Data mining your identity

A look at the Google Privacy Center makes it clear that Google makes extensive use of this information. The privacy policy explains that Google keeps your data indefinitely, combines it with data from other services, even from third parties, and gives this information to whomever it wants.
And this is what a well-known technology commentator from the BBC says about privacy at Google: “Google builds up a detailed profile of your search terms over many years. Google probably knew when you last thought you were pregnant, what diseases your children have had, and who your divorce lawyer is.”
The collection of user information is not unique to Google. Other search providers like Yahoo and Microsoft, as well as large shopping sites such as Amazon and eBay, apply similar policies. What makes the difference is the ubiquity of Google’s search engine, AdSense technology, Gmail, Desktop Search and the other services described above, which gives it a unique wealth of information.

Google as a data kiosk for the US government

Civil rights organizations such as the Electronic Frontier Foundation warn about the huge masses of data that search engines collect. This data could attract both criminal data thieves and state authorities. Spammers would pay a lot for Gmail data, and in combination with other Google databases such as AdSense and search histories they could obtain a great deal of personal information about Google users.
But the bigger danger comes from US state authorities. They welcome the fact that Google collects the search terms, date and time, IP addresses and browser configuration of more than 200 million search requests every day. And they welcome even more the fact that Google knows who communicates with whom and about what, what the same user blogs, what he chats about with other users, which news and books he reads, which ads he clicks on and which products he shops for. All this information is stored on the web servers of a private company, protected only by its “Don’t be evil” code and weak US data protection laws. With the amount of information Google holds about the private lives of its customers, it is becoming an ever more important investigative tool for the police and secret services. A Google spokesman announced that the company will inform users whose data has been handed over in civil proceedings. But the danger lies in data requested in criminal proceedings and under anti-terror laws: under the USA Patriot Act, companies are prohibited, under penalty of prison, from informing their customers about information requested by state authorities. How often is this happening in the years since September 11?

“Don’t be evil”?

Google’s corporate motto “Don’t be evil” takes on an ambiguous flavor at a time when elites are trying to diminish the web’s standing as an important information tool for the masses. That’s why Google Watch and over 500 others nominated Google for a Big Brother award in 2003. The nine points they raised in connection with this nomination, which necessarily focus on privacy issues, also serve as a summary of my talk about Google:

1. Google’s immortal cookie:
Google was the first search engine to use a cookie that expires in 2038. This was at a time when federal websites were prohibited from using persistent cookies altogether. Now it’s years later, and immortal cookies are commonplace among search engines; Google set the standard because no one bothered to challenge them. This cookie places a unique ID number on your hard disk. Anytime you land on a Google page, you get a Google cookie if you don’t already have one. If you have one, they read and record your unique ID number.

2. Google records everything they can:
For all searches they record the cookie ID, your Internet IP address, the time and date, your search terms, and your browser configuration. Increasingly, Google is customizing results based on your IP number. This is referred to in the industry as “IP delivery based on geolocation.”

3. Google retains all data indefinitely:
Google has no data retention policies. There is evidence that they are able to easily access all the user information they collect and save.

4. Google won’t say why they need this data:
Inquiries to Google about their privacy policies are ignored. When the New York Times (2002-11-28) asked Sergey Brin about whether Google ever gets subpoenaed for this information, he had no comment.

5. Google hires spooks:
Matt Cutts, a key Google engineer, used to work for the National Security Agency. Google wants to hire more people with security clearances, so that they can peddle their corporate assets to the spooks in Washington.

6. Google’s toolbar is spyware:
With the advanced features enabled, Google’s free toolbar for Explorer phones home with every page you surf, and yes, it reads your cookie too. Their privacy policy confesses this, but that’s only because Alexa lost a class-action lawsuit when their toolbar did the same thing, and their privacy policy failed to explain this. Worse yet, Google’s toolbar updates to new versions quietly, and without asking. This means that if you have the toolbar installed, Google essentially has complete access to your hard disk every time you connect to Google (which is many times a day). Most software vendors, and even Microsoft, ask if you’d like an updated version. But not Google. Any software that updates automatically presents a massive security risk.

7. Google’s cache copy is illegal:
Judging from Ninth Circuit precedent on the application of U.S. copyright laws to the Internet, Google’s cache copy appears to be illegal. The only way a webmaster can avoid having his site cached on Google is to put a “noarchive” meta in the header of every page on his site. Surfers like the cache, but webmasters don’t. Many webmasters have deleted questionable material from their sites, only to discover later that the problem pages live merrily on in Google’s cache. The cache copy should be “opt-in” for webmasters, not “opt-out.”

8. Google is not your friend:
By now Google enjoys a 75 percent monopoly for all external referrals to most websites. Webmasters cannot avoid seeking Google’s approval these days, assuming they want to increase traffic to their site. If they try to take advantage of some of the known weaknesses in Google’s semi-secret algorithms, they may find themselves penalized by Google, and their traffic disappears. There are no detailed, published standards issued by Google, and there is no appeal process for penalized sites. Google is completely unaccountable. Most of the time Google doesn’t even answer email from webmasters.

9. Google is a privacy time bomb:
With 200 million searches per day, most from outside the U.S., Google amounts to a privacy disaster waiting to happen. Those newly-commissioned data-mining bureaucrats in Washington can only dream about the sort of slick efficiency that Google has already achieved.
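Points 1 and 2 describe how a persistent cookie ID ties otherwise separate requests together. A minimal Python sketch of that linkage follows, assuming a hypothetical log format with the five recorded fields. The class, the function and the sample entries are invented for illustration and say nothing about how Google actually stores its logs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchLogEntry:
    """One logged search, with the five fields named in point 2 (hypothetical layout)."""
    cookie_id: str
    ip_address: str
    timestamp: str
    search_terms: str
    browser_config: str


def build_profiles(entries):
    """Group search terms by cookie ID, regardless of the IP address used."""
    profiles = defaultdict(list)
    for entry in entries:
        profiles[entry.cookie_id].append(entry.search_terms)
    return dict(profiles)


# Invented sample data; the IP addresses come from ranges reserved for documentation.
log = [
    SearchLogEntry("id-a", "192.0.2.10", "2006-03-01T09:15", "divorce lawyer", "Browser X"),
    SearchLogEntry("id-b", "198.51.100.7", "2006-03-01T09:20", "cheap flights", "Browser Y"),
    SearchLogEntry("id-a", "203.0.113.5", "2006-03-04T22:40", "pregnancy test", "Browser X"),
]

profiles = build_profiles(log)
for cookie_id, terms in profiles.items():
    print(cookie_id, terms)
```

The two "id-a" searches come from different IP addresses and different days, yet they end up in the same profile, which is why the long-lived cookie matters more than any single log line.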

This talk was presented at FILE – Electronic Language International Festival in Rio, Brazil, in March 2006.

2 Responses to “You are being googled!”

  1. geraldo Says:

    At the conference I was asked about art projects related to Google. In fact the web is flooded with art projects related to Google. Here is a short list of some interesting ones:

    Google Will Eat Itself: Web project by Hans Bernhard and Alessandro Ludovico which aims to buy Google with funds generated via Adsense: http://www.gwei.org/

    Googlegrams: Joan Fontcuberta uses Google to create large photo-mosaics that comment on the internet-era’s liaisons between mass media and our collective consciousness: http://www.zabriskiegallery.com/fontcuberta2006/images.htm

    Montage-a-google: Uses Google’s image search to generate a large gridded montage of images based on keywords entered by the user: http://grant.robinson.name/projects/montage-a-google/

    Searchscapes: Juliana Yamashita tries to build a 3D map of Manhattan using existing data from the web, comparing representations of the city’s “physical spaces” and “information spaces”: http://www.searchscapes.net/

    Buzztracker: Craig Mod’s software has been mining Google News for over a year and maps relationships between geographic locations mentioned in articles: http://www.buzztracker.org/

    newsmap: Treemap of Google news: http://www.marumushi.com/apps/newsmap/newsmap.cfm

    The web as a living organism: This net.art installation is trying to examine the relationship between internet users, information and the internet itself in the same way we look at the relationship between blood cells, Oxygen and the body itself: http://www.isaliving.org/

    Hermenetka: Lucia Leão uses Google search engine results to build a cartography of keywords related to the Mediterranean: http://hermenetka.org/

    Casual: A non-hierarchical conceptual landscapes generator based on Google’s text and image search: http://nualart.com/casual/casual.cgi

    toogle: Toogle generates ASCII images from Google’s Image Search: http://c6.org/toogle/

  2. Noreen camara Says:

    Is there software that protects you from anyone googling and finding my emails?
