• Your private telephone bills have been sold.
• Banks have let your personal banking records slip into the wrong hands and then taken months to let you know.
• Companies are creating consumer profiles without your knowledge.
• The Patriot Act has gone much too far.
CNET News.com asked telecommunications and Internet companies about cooperation with the Bush administration's domestic eavesdropping scheme. We asked them: "Have you turned over information or opened up your networks to the NSA without being compelled by law?"
|Company||Response|
|Adelphia Communications||Declined comment|
|AOL Time Warner||No|
|Cable & Wireless*||No response|
|Charter Communications||No|
|Cingular Wireless||No|
|Citizens Communications||No response|
|Cogent Communications*||No|
|Level 3*||No response|
|NTT Communications*||Inconclusive|
|Qwest Communications||No|
|SAVVIS Communications*||No response|
|Sprint Nextel||No|
|T-Mobile USA||No|
|United Online||No response|
|Verizon Communications||Inconclusive|
|XO Communications*||No|
* = Not a company contacted by Rep. John Conyers.
Google's recent legal spat with the U.S. Department of Justice highlights not only what information search engines record about us but also the shortcomings in a federal law that's supposed to protect online privacy.
It's only a matter of time before other attorneys realize that a person's entire search history is available for the asking, and the subpoenas begin to fly. This could happen in civil lawsuits or criminal prosecutions.
That type of fishing expedition is not legally permitted for Web mail providers. But because search engines are not fully shielded by the 1986 Electronic Communications Privacy Act--concocted back in the era of CompuServe and bulletin board systems--their users don't enjoy the same level of privacy.
"Back then, providers were very different animals than they are now," says Paul Ohm, a former Justice Department attorney who teaches computer crime law at the University of Colorado at Boulder.
Two solutions are simple to describe, but not likely to happen. First, search engines could voluntarily--or be required by law to--delete search histories after a few months unless the customer objects. Second, federal law could be amended to make it clear that search engines, which serve as a window to the Internet, are fully protected.
CNET News.com has surveyed Google, Microsoft, Yahoo and AOL to find out their privacy practices, and assembled these answers to frequently asked questions.
Q: Does Google collect and record people's search terms whether they're logged in or not?
Yes. Google confirmed this week that it keeps and collates these results, which means the company can be forced to divulge them under court order. Whether Google does anything else with them is another issue.
Given the Department of Justice's recent subpoena to Google, it's likely the police or even lawyers in civil cases--divorce attorneys, employers in severance disputes--eventually will demand that Google, Microsoft, Yahoo, AOL, and other search engines cough up users' search histories.
Q: Has this happened before?
Almost. A North Carolina man was found guilty of murder in November in part because he Googled the words "neck," "snap," "break" and "hold" before his wife was killed. But those search terms were found on Robert Petrick's computer, not obtained from Google directly.
Also, attorneys have already begun introducing searches conducted on Google, Yahoo and AltaVista as evidence.
Q: When I use search engines, I type in a lot of search terms I consider private. What does this mean?
We go into all the details below. But the short answer is that when private companies collect reams of data all the time on nearly every American, and the government and curious attorneys can get to that with few obstacles, this becomes a problem. Search engines provide a look into people's personal lives, and privacy awareness has not kept pace.
Q: Aren't there any privacy laws that protect us?
Not really. There is a federal law called the Electronic Communications Privacy Act. But it was enacted in 1986, long before politicians knew about the Internet, and the wording doesn't prevent police and attorneys from targeting search engines.
Politicians wrote that law in a way that is technology-specific--one key part revolves around the meaning of the pre-Internet term "processing services"--instead of adopting a more flexible approach that would grow with technology. Some states may have laws that are more applicable.
Q: Why does Google store that information about me, anyway?
No law requires Google to delete it, and there are some business justifications for keeping it.
For instance, keeping detailed records can help in identifying click fraud (faking clicks on Web ads to drive up a rival's cost), and in optimizing search results for different geographic areas. Compiling a user profile can aid in tailoring search results in products like Google Personalized Search. Also, disk storage is cheap, and engineers tend to prefer to keep data rather than delete it.
But it's hardly clear that a compelling reason exists for keeping older records--beyond a few months--unless a customer voluntarily chooses options like personalization.
Q: Does that mean Google has the technical ability to link a person's searches together and divulge them when legally required?
Yes. Google says in its FAQ that it records Internet address, date, time, browser type, operating system and a cookie ID.
Author and entrepreneur John Battelle received word from Google this week that the company can perform two important types of matches. (We confirmed this with Google and followed up with additional questions.)
First, given a number of search terms, Google can produce a list of people (identified by Internet address or cookie) who searched for a given term. Second, given a collection of Internet addresses, Google can produce a list of the terms searched by the user of a given address. That effectively creates an electronic dossier of an individual.
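To make the two kinds of matches concrete, here is a minimal sketch in Python. It is not Google's system: the record layout (`LogEntry`), the function names and the sample addresses are all hypothetical, loosely modeled on the fields Google's FAQ says it records.

```python
from collections import defaultdict
from typing import NamedTuple


class LogEntry(NamedTuple):
    # Hypothetical record shape; a real search log would hold more fields.
    ip: str
    cookie_id: str
    query: str


def users_for_term(log, term):
    """First match: who (by address and cookie) searched for a given term."""
    needle = term.lower()
    return {(e.ip, e.cookie_id) for e in log if needle in e.query.lower()}


def terms_for_addresses(log, addresses):
    """Second match: what was searched from each of the given addresses."""
    wanted = set(addresses)
    result = defaultdict(list)
    for e in log:
        if e.ip in wanted:
            result[e.ip].append(e.query)
    return dict(result)


# Made-up sample data using addresses reserved for documentation.
log = [
    LogEntry("192.0.2.10", "cookie-a", "cheap flights"),
    LogEntry("192.0.2.10", "cookie-a", "divorce lawyer"),
    LogEntry("198.51.100.7", "cookie-b", "Divorce lawyer fees"),
]
```

Calling `users_for_term(log, "divorce")` yields both users, while `terms_for_addresses(log, ["192.0.2.10"])` yields everything searched from that one address, which is the dossier described above.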
Q: What about other search engines?
We surveyed AOL, Microsoft and Yahoo as well. Microsoft and Yahoo gave us the same response as Google did.
AOL's was a little different. Spokesman Andrew Weinstein said AOL could provide a list of search terms typed in by a user. But AOL does not have a system in place to perform the opposite mapping, which would find out which users typed in a given search term. Weinstein also said that AOL deletes personally identifiable search data after 30 days, which makes it unique among the quartet we surveyed.
Q: What about links people click on from search engine results? Can that information be turned over too?
Yes. Through a process known as redirection, Yahoo and AOL record what links people click. Unless the companies discard these records, they would be fair game for a subpoena.
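For readers curious how redirection can record clicks, here is a rough sketch in Python. It does not describe Yahoo's or AOL's actual implementation: the endpoint `search.example/click`, the parameter names `q` and `to`, and both function names are assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical redirect endpoint run by the search engine.
REDIRECT_ENDPOINT = "https://search.example/click"


def wrap_result_link(target_url, query):
    """Rewrite a result link so the click passes through the search engine."""
    return REDIRECT_ENDPOINT + "?" + urlencode({"q": query, "to": target_url})


def handle_click(redirect_url, click_log):
    """Record the click, then return the real destination.

    A real server would answer with an HTTP redirect to the returned URL.
    """
    params = parse_qs(urlparse(redirect_url).query)
    target = params["to"][0]
    click_log.append({"query": params["q"][0], "clicked": target})
    return target
```

The person clicking still ends up at the page they wanted, but the search engine now holds a record pairing the query with the link that was followed.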
Q: Let's say the Bush administration wanted to obtain a list of the names or Internet addresses of anyone who typed "how to grow marijuana" or "how to cheat on income taxes" into Google. Could that be done?
Probably. If the Electronic Communications Privacy Act does not apply, all that's required is a subpoena from a prosecutor, and no prior approval from a judge is necessary. One Harvard law professor calls the subpoena power "akin to a blank check."
"The threshold rule is relevance," says Paul Ohm, the University of Colorado law professor. "Relevance has been quite broadly construed. As long as you can show that something's relevant to a case or criminal investigation, I think the litigant would have a pretty good argument."
Using the examples of finding out who did searches like "how to make meth" or "how to kill the president," Ohm says prosecutors "would have a very good argument that it's relevant to an investigation."
Q: How can I protect my privacy from search engines?
First, to protect your privacy if your computer is stolen, you can clear your browser's history (sometimes called "private data"). In Firefox, select that option from the Tools menu and delete your browsing history and saved form information. Apple Computer's Safari has a similar option under the History menu. Encrypting your hard drive through OS X's FileVault or PGP's Whole Disk Encryption may be a good idea.
Second, you can clear the cookies that are set by search engines. In Firefox, go to Preferences and select Privacy. You have the option to delete cookies and even prevent certain sites from ever setting them again. Be warned, though, that adding Google.com to the list may prevent using options like personalization or Gmail.
Danny Sullivan has posted a more extensive list of recommendations at SearchEngineWatch.com.
Q: Is Congress going to do anything?
Rep. Ed Markey, a Massachusetts Democrat, has pledged to introduce legislation to prevent storing search terms "beyond a reasonable period of time."
There are some political and practical problems with this approach. First, Markey is a liberal Democrat in a town controlled by Republicans, so his proposal isn't going anywhere. Second, any such law could be wildly disruptive--it could mean class-action lawyers would get rich suing tech companies on charges that their data-retention duration is not "reasonable."
Finally, it's hardly clear that the Bush administration will embrace such a proposal--search terms could prove useful in criminal prosecutions, and the Justice Department seems to like the ability to demand them from search engines.
Q: How are Internet addresses handed out? Do people always have the same one?
It depends. Many DSL and cable modem providers allocate Internet addresses only when they're in use (the methods are called DHCP and PPPoE). Those IP addresses can change frequently.
Other IP addresses tend to be fixed. Faculty and staff members at universities, and employees of corporations, are more likely to have fixed Internet addresses.
Q: If Google knows I'm connecting from a dynamically assigned Internet address of 188.8.131.52 one day, and 184.108.40.206 the next day and 220.127.116.11 the third, how can it link my queries together to create that dossier?
This is where "cookies" come in. A cookie is simply a device for a Web site to recognize people the next time they return. Google, Yahoo, AOL and Microsoft all set cookies by default. (Microsoft's expire in 2016; Yahoo's in 2010; Google's in 2038. AOL sets a third-party cookie that expires in 2011.)
In the above example, Google.com would set a cookie for whoever's connecting from Internet address 188.8.131.52 the first day, and then figure out that the same Web browser is connecting from 184.108.40.206 and 220.127.116.11 the next two days. If people are logged in to their Google account, this makes the process even easier, of course.
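The linking step can be sketched in a few lines of Python. This is an illustration under assumptions, not any search engine's real code: the function name `link_by_cookie`, the tuple layout and the cookie IDs are hypothetical.

```python
from collections import defaultdict


def link_by_cookie(entries):
    """Group (ip, cookie_id, query) log entries by cookie ID.

    The cookie ID stays the same even when the Internet address changes,
    so it ties together searches made from different addresses.
    """
    profiles = defaultdict(lambda: {"addresses": [], "queries": []})
    for ip, cookie_id, query in entries:
        profile = profiles[cookie_id]
        if ip not in profile["addresses"]:
            profile["addresses"].append(ip)
        profile["queries"].append(query)
    return dict(profiles)


# Made-up sample data: one browser seen from two addresses on different days.
entries = [
    ("192.0.2.10", "abc123", "cheap flights"),
    ("198.51.100.7", "abc123", "divorce lawyer"),
    ("203.0.113.5", "zzz999", "weather"),
]
profiles = link_by_cookie(entries)
```

Here `profiles["abc123"]` collects both addresses and both queries, even though no single address was used for both searches.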
Q: Even if a search engine company knows my Internet address is 188.8.131.52, and links my previous searches together, how can they--or the government--get my name, home address or other information?
If you have a Google account for products like Gmail, Google Groups, Personalized Search or Google Alerts, Google knows your e-mail address and other personal information, which it can be forced to disclose. If a Web publisher signs up for Google AdSense for advertising revenue, Google will have the publisher's real name, mailing address and Social Security Number.
If a person doesn't use any other Google services, all the company can divulge in response to a subpoena is that person's Internet address. Then whoever's asking about the person will send a second subpoena to the person's Internet service provider to find out billing information. This is a relatively straightforward procedure used by the Recording Industry Association of America (RIAA) in thousands of file-swapping lawsuits.
Q: Has anyone ever sent search engines a subpoena or other kind of legal request for someone's search terms?
We don't know. Google and Yahoo refused to answer the question, though there is no law prohibiting them from doing so.
AOL said only that the Electronic Communications Privacy Act would apply. Microsoft was by far the most forthcoming. With the exception of the Justice Department subpoena for search terms (without user identities) last year, Microsoft said it has "not received either criminal or civil requests related to MSN Search data."
Microsoft also said it "has never received either criminal or civil requests" to produce the lists of people who typed in a search term. Oddly, the other companies were not nearly as open.
Q: How long do companies keep records of my search terms?
Microsoft, Google and Yahoo all said they keep data as long as it's necessary, which could mean forever. Microsoft did add that the company is "looking at ways" to provide users with the option to delete their search histories, and Yahoo made a similar statement.
AOL, on the other hand, says it deletes personally identifiable data after 30 days.
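A fixed retention window of this kind is simple to express in code. The Python sketch below shows one possible reading of a 30-day rule, in which whole records older than the cutoff are dropped. It is not AOL's system, and the record layout and the name `purge_expired` are assumptions.

```python
from datetime import datetime, timedelta

# Assumed policy: keep identifiable search records for 30 days.
RETENTION = timedelta(days=30)


def purge_expired(records, now):
    """Return only the records that are still inside the retention window."""
    cutoff = now - RETENTION
    return [r for r in records if r["timestamp"] >= cutoff]


# Made-up sample data.
records = [
    {"cookie_id": "abc123", "query": "cheap flights", "timestamp": datetime(2006, 1, 25)},
    {"cookie_id": "abc123", "query": "divorce lawyer", "timestamp": datetime(2005, 12, 1)},
]
kept = purge_expired(records, now=datetime(2006, 2, 1))
```

Once a purge like this has run, the older query is no longer there to be handed over in response to a later subpoena.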
CNET News.com's Elinor Mills contributed to this report.
Protecting Your Search Privacy: A Flowchart To Tracks You Leave Behind
Wired's "How to Foil Search Engine Snoops" is a nice guide to protecting your search privacy, but it doesn't really go far enough. In particular, anyone who assumes they've protected themselves by using an anonymizing tool is probably not eliminating the important ISP aspect. Meanwhile, laws being considered to force search companies to destroy data must consider the role of ISPs to fully provide the intended protection.
In this piece, I'll take you step by step through how your search privacy data gets exposed, all the way from your desktop to the sites you visit. Let me make some caveats before I begin.
Normally with stuff like this, I like to do a "Big Story With Answers To All The Questions" type of piece. That's what I tried to do back in 2003, the last time search privacy really came up as an issue. Much of what I wrote then is still applicable to the issues today, and I'll be drawing on those pieces. You may wish to read them as well:
I definitely don't have all the answers to all the privacy questions in this piece, especially as privacy issues have gotten more complex. But I wanted to make a start, perhaps the beginning of a living document or future article that will provide all the answers. I'd especially invite those with additional tips, observations and so on to contribute to a Search Engine Watch Forum discussion on this topic -- the link will be at the end of the article.
Onward to the search privacy flowchart. It's not an illustrated one in the traditional sense, but it should give you an idea of all the traces you leave behind when searching for something.
In November, we wrote of a man convicted of killing his wife in part because authorities found he'd searched for "neck," "snap," "break" and "hold" on Google. But that information was not handed over by Google itself. Instead, it was found in traces left behind on the man's own computer.
Anything you do on the internet gets recorded on your own computer in various ways. Pages you've visited are stored in your computer's cache, and a history of the URLs you've seen and things you've searched for may also get stored in your browser.
Clearing Your Search History From Google And Other Search Engines from me in 2003 covers some of the ways to delete what you've looked for in Internet Explorer 5, much of which is applicable to Internet Explorer 6.
How do I delete the drop-down list of my past searches? over at Google looks to be a very comprehensive guide on clearing out any search history that appears in the search box on the Google home page.
That information is NOT saved at Google. Instead, it's recorded within your own browser. The Google page gives instructions for cleaning out IE, Firefox, Safari and other browsers. Also, these same instructions should work to clear out your search history at all search engines in one go, not just at Google.
Unfortunately, there are so many search toolbars out there that they might keep their own histories independently of your browser. Google's does, and the page above from Google has instructions on clearing that out. MSN has instructions on clearing its toolbar history here. Instructions for Yahoo are here. For other tools, a first stop is to check the help pages for them.
Now that you've cleared out saved searches, you've still got URL histories and saved pages you might need to clear. How to clear your browser's cache and cover your tracks on the Web looks to be a pretty good article to guide you on how to delete this type of material. It also points to a number of software tools to make life easier. There are also more tools here, here and here from Download.com.
Software may be the way people need to go, as search gets more and more embedded into everything. Running any desktop search tools? They may be storing information you want to delete. For example, Google's desktop search tool also stores all the pages you view on the web. When I last looked, deleting your browser cache did not destroy the data Google Desktop itself keeps.
Managed to wipe everything out either manually or with software? Now go wipe out your hard drive. That's because even if you delete files, people with the right tools and knowledge might still be able to bring back the data. Some of the tools mentioned above may be able to make this easier so that something you've deleted really stays deleted. But the most surefire way to do so would be to physically destroy your computer's hard drive, literally prying out the metal platter where the info is recorded and ideally breaking it up into multiple parts that would be disposed of in various places.
Back to reality, most people aren't going to do that. But I'm trying to underscore how difficult it is to absolutely protect your privacy from prying eyes right on your own computer.
For those worried that tips like cleaning search history from your desktop are helping potential wrongdoers, keep in mind that there are plenty of innocent reasons for wanting to clear search information. For example, a neighbor's older son had looked up porn on their computer. My neighbor could not figure out how to get rid of the pornographic search terms that kept appearing in the search drop-down box that his younger daughter was seeing.
The weakest link in protecting your search privacy is your ISP. Everything you do is going to flow out of your computer and through your ISP to a search engine. Your ISP will see the pages you are requesting and in all likelihood have some type of records of what you've done for a set period of time. Whatever deletions you do on your own computer -- plus whatever things you do to be anonymous with search engines -- these have no impact on your ISP. It sees all.
EarthLink has security measures in place to protect the loss, misuse, and alteration of the information under our control. While we make every effort to ensure the integrity and security of our network and systems, we cannot guarantee that our security measures will prevent third-party “hackers” from illegally obtaining this information. We will never sell your information to a third party.
How long are records of what you've visited kept? Do these records exist at all? How might they be shared with others? Answers aren't provided.
Back in June, I wrote of a Reuters article (no longer at Reuters, but there's a copy here) that cited one analyst saying that most ISPs don't keep data for longer than a month. In Europe, governments themselves apparently mandate a one- to three-year retention of data, according to a News.com article from last year. Ironically, while the current US government request for search data has at least one lawmaker considering whether search engines should destroy data, that News.com article says the US government seeks to force ISPs to keep data longer.
By the way, even if your ISP deletes data, you'd better make sure they are forcing companies that mine their data to do the same. Better Search Privacy Needs Addressing Overall from me covers how third party companies such as Hitwise take in ISP data as a way to track what people are doing on the internet.
Visit a major search engine, and it keeps track of every request you make. It will also assign you a cookie, unless you reject these. That's easy enough to do, and the Wired article gives you some tips on that.
Rejecting cookies still leaves behind your internet address. My Search Privacy At Google & Other Search Engines article and the other one I've just posted, Private Searches Versus Personally Identifiable Searches, explain this a bit more. Basically, it links your request back to your ISP and thus still back to you, if someone has access to your ISP.
The Wired article suggests using an anonymizing tool to avoid this. Anonymizer is a long-standing one. However, most anonymizing tools only prevent sites you visit from seeing your real internet address. They don't prevent your ISP from seeing where you are going.
I learned of the Tor anonymizing service through the Wired article. It's not clear to me whether that prevents the ISP tracing as well. According to Dave Naylor, a search marketer who also runs his own ISP, your activity would be hidden from your ISP only if Tor keeps all information you send encrypted between your computer and the Tor servers you tap into.
Ethan Zuckerman (author of A technical guide to anonymous blogging - a very early draft) has a nice post about using Tor over here, but it doesn't seem to address the ISP question.
Let's flip things around and say you are NOT worried about visiting your favorite search engine and staying anonymous. In fact, you've decided to embrace the search history features they offer, which frankly can be really useful. Google's, for example, I find does a good job of improving my results based on pages I've visited.
All the major search engines embed the search terms you used into the URL that appears in the address field of your browser. When you click on a listing, that URL is sent as "referrer" information to the web site you go to. That means what you searched on is sent to the web site you ultimately visit from a search engine. They're able to know the search terms you used plus your IP address.
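To show how little work this takes on the receiving end, here is a small Python sketch of what a web site could do with the referrer it receives (HTTP spells the header "Referer"). The function name is hypothetical, and the parameter name is an assumption that varies by search engine: Google uses `q`, while others use different names.

```python
from urllib.parse import parse_qs, urlparse


def search_terms_from_referrer(referrer, param="q"):
    """Pull the search terms out of a search engine results URL.

    Returns None when the URL carries no such parameter.
    """
    query = parse_qs(urlparse(referrer).query)
    values = query.get(param)
    return values[0] if values else None


# Made-up referrer of the kind a site might see after a click from Google.
terms = search_terms_from_referrer(
    "http://www.google.com/search?q=divorce+lawyer&hl=en"
)
```

Here `terms` comes back as `divorce lawyer`, which the site can log right alongside the visitor's IP address.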
Referrer information is precious data to web sites. It allows them to know exactly how people found them. As a search marketer, I'd hate to see this information go away. But it is a privacy issue to be aware of.
Many web sites make use of third party analytic services, such as ClickTracks, WebSideStory, WebTrends or Google Analytics. That means these services are almost like clearinghouses of search data. They see what many people are searching for -- and clicking on -- from all over the web through the data from thousands of clients using them. Potentially, they are just as rich a target for any government agency to mine as the search engines themselves.
To protect yourself, you want to ensure your browser doesn't pass along referral information. In Internet Explorer, I see no native way to do this. You'll have to turn to products like Norton Security or the tool I use and much prefer, ZoneAlarm. There are certainly other third party tools out there. For Firefox, there's at least one extension you can try.
As you can see, ensuring your search privacy is tricky. The information you send is leaving traces in multiple places. The solution to ensuring privacy isn't going to be as easy as passing a law that targets Google, Yahoo and the others. Ideally, the entire lifecycle of a search beyond the computer desktop needs to be considered from ISP through to tracking services. Searchers themselves also need to consider what they do on their own computer desktops.
There's also an issue of what should be private. I wrote earlier today that most people probably think of the conversations they have with search engines as being private. But to date, we don't have any protected searcher-search engine relationship as we do with attorney-client privilege or between clergy and worshipper. Perhaps that needs to be enshrined in some way. But then again, others may feel that going out on to the public web and using publicly accessible search engines entitles no one to an expectation of privacy, or perhaps a more limited one.
Certainly, we need to have a good debate and discussion. That's probably the good that's coming out of the Department Of Justice action. After years of worrying about privacy issues, the DOJ action is turning that worry into action about better protections that may need to be put into place.
Let me add that while I hate the sloppy manner in which the DOJ has acted in this particular case, I have no more interest in criminals using the internet for bad purposes than most people would. In specific circumstances, with the right legal oversight, I hope search or internet browsing data might be evidence that helps catch a criminal, just as I hope they'd be caught through legally approved wiretapping or other types of law enforcement monitoring.
What I don't want is a Big Brother state to be mining everything with the assumption we're all criminals, any more than I want all telephone calls to be monitored. Moreover, it's very, very easy to mistakenly assume from a search request that something wrong is happening, when it is not. Jon Swift takes a light-hearted look at this in his post today, but it's true. A search for "bombing the white house" doesn't mean someone's planning to do that. It may simply be that you're trying to find out about someone who may have attempted this.
Aside from the government issue, there's the concern that the search companies themselves might misuse data. That needs to be considered, and improved guidelines or laws developed. Even better would be to see such moves as part of improved protection of consumer information of all types. The amount of data about what people personally are interested in and do seems easier to obtain from consumer research organizations right now than what search engines possibly might provide in the future. How about considering these both together, rather than separately? That idea came up in a Newsfactor article on Google and consumer data in general last year.
For more on the current issue of the Department Of Justice request for search data, please see these articles from us and others:
Want to comment on things discussed in this article? We have three Search Engine Watch Forum threads where everyone is welcome:
Postscript: Anonymizer tells me that if you are using only the IP hiding function in Anonymizer, then your ISP will see what you are doing. However, if you use the SSL encrypted "Surfing Security," then your ISP cannot see what you are doing. They're using a better metaphor for this now, calling it a "virtual tunnel" between you and the Anonymizer servers. Ah, but what records does Anonymizer itself keep? None, the company tells me:
The way that the technology is architected, it does not retain any information about users' requests so even if subpoenaed, no information can be supplied because, simply, they do not keep any of it. For example, they would not be able to share with anyone where a user is by IP address, or what sites they visited, or anything else, because even Anonymizer does not know. Additionally, the company provides software for use in instances where a privacy breach might have severe consequences, even death in some cases (where the company protects freedom of speech in foreign countries, Anonymous tips, etc.). Anonymizer has never had a single breach since it began selling products and services in ’97, due to its level of security. Trust is a key difference.