Where does Google actually say that they won’t read Gmail or Google Docs? (law.harvard.edu)
128 points by espeed on Nov 3, 2011 | hide | past | favorite | 90 comments



The tl;dr of this comment thread is that most HN'ers have sadly taken the question too literally and are commenting on the fact that Google reads emails, with "read" defined as "parse" (i.e. computationally). I think most people know and expect that they do that, for advertising targeting etc. I can't believe people are shocked or surprised at that.

Which is a shame because I think it's pretty clear the OP is actually asking about Google employees reading your email, which is a valid question but sadly lost in the fight over semantics. :/


> Which is a shame because I think it's pretty clear the OP is actually asking about Google employees reading your email

Unless you run your own SMTP and POP/IMAP mail servers, there is no way to be sure no one will read your emails. Even then, your servers could get hacked, or your contacts might use a hosted solution (and most people do) that could get hacked, so there is again no guarantee. The only workaround is to fully encrypt your emails, and even then someone could steal your private keys, or you could be using unpatched encryption software, etc.

It's a matter of trust and time saving. Do you trust Google, Yahoo employees etc. not to spy on you more than you trust your ability and your contacts' ability not to get your mail servers hacked? How much of your time and money are you ready to waste maintaining such a server?


> Unless you run your own smtp and pop/imap mail servers, there is no way to be sure no one will read your emails.

I think you really want to encrypt your email for that kind of peace of mind. Of course you need buy in from your contacts.


Actually there is another option which is to store emails on a secure webserver instead of sending them, and send "you've got mail" notifications instead. There are some usability issues such as message threading and what not, but technically this is much simpler to set up than the conventional encrypted email.


Except that Lucent have no real reason to put firmware in a fiber switch to read your email as it goes down the tubes.

A company that makes all its money from targeted ads - might



Did you even read parent's post? We all agree that google (and yahoo, and hotmail) parse our emails to offer better ads and to index them for search. What we're talking about is bad employees reading your emails for their own curiosity.


This is getting off-topic, but did you forget about AT&T's secret rooms that were recording internet traffic? It's not just Google that wants to read your mail. Governments want to spy on their citizens, and ISPs are (allegedly) cooperating in the effort.


sure but that's basically a "constant" in that it's going to happen regardless of where you receive your emails.


The real question is not whether google, dropbox or anyone else has explicit "policies" about reading content from users. They're just going to say whatever makes them look good from a marketing and legal perspective. Companies lie all the time, systematically and comprehensively. They act in their own self-interest period. From their point of view it is merely a question of how much they can get away with.

The more interesting question to explore is how much we actually trust these companies with our personal information and how do we decide when we reach a limit to our trust.

Do I believe that google can read my gmail? Yes. Do I think this capability will cause harm to come to me as a result? Hell no. There's nothing "in it" for them to do so for me or millions of others. I "trust" them for this reason.


Speaking of trusting Google... there was a situation at a company I worked at where the management was strongly against transferring the company mail system to Google Apps/mail because of the concern that Google might read their email. They wanted to keep their internal mail infrastructure for the sake of the company's privacy.

The CTO at the time convinced them that there was really no point in keeping things in-house, since the majority of the management, for convenience, forwarded all their company email to their personal Gmail accounts anyway. Funny how security is not about what you think but what you do.


Until there is something in it for them.

It's a pretty classic security question, sometimes usability means that you accept zero security and completely share your private correspondence with the world.

I think trust is the wrong word, you are really just accepting the risk of sharing all your data for the benefits because the risk is low.


Yes, I agree "trust" is the wrong word. It is perhaps better to think of it as a trade-off of some risk for some convenience and that's OK.

In any case, back to the OP's question, the stated "policy" about privacy means virtually nothing unless there exists a third party who can effectively verify that policy.


Yes, in my experience on either side of those "policies" they really do mean nothing.

You just have to assume that all the data you share is no longer yours and will be used however the company sees fit.

I am amazed by the number of people who assume that "that would be bad PR if they got caught" is good enough security to protect data they consider sensitive. Especially when the PR damage historically has been very low, especially if you are a sexy and loved company like facebook or google.


Imagine you're a startup and use gmail or Google Apps for your team communications, even customer service. One day Google decides it wants to acquire you. During negotiations, it has access to all email exchanges between you and your customers and, more importantly, you and your investors / team members. So there's definitely something "in it" for them to do so.


If you had strong evidence that Google was spying on you (especially for such anti-competitive reasons), you'd have them over a barrel a lot more than vice-versa.


Well, I don't. That's how convenient this arrangement is :)


Even if you trust a company not to do you harm with the information they store, there exist entities that can compel them to turn this information over. For example, the information can be (sometimes secretly) subpoenaed by the U.S. government.

The choice between being evil and not being evil can be an easier one to make, practically speaking, than the choice between being evil and becoming the target of federal criminal prosecution.


What are the chances that some random hosting company won't just do the same thing if you ran all of your services on a VPS or colo box?


Exactly. And that applies to any company not just "evil" ones. It is a bit naive to trust something as abstract as an organization. Trust applies to individual people, and a company is bound to be a very diverse group of people with different agendas.

If you want your mail to be reasonably safe from third-party reading, the only solution is to encrypt it before sending, and ask people that mail you to do the same. Anything that relies on trust is a half-assed solution.


In theory, yes of course, you can encrypt your communications personally and "figure out" on a case by case basis how to do key exchange with the end-party. But that is a MAJOR OBSTACLE for all but the most patient and tech-savvy people and totally overkill for all but the most critical life-or-death information exchanges.

In practice, unless both you and your recipient are operating your own email servers and key-exchange/encryption services, you HAVE TO "trust" a third party with your private information.


The point is that if people send their mail encrypted it is encrypted both in transit and when it is stored on the mail server. So the third party can be anyone, and you don't need to specifically trust them.

Also, this isn't as big an obstacle as you make it out to be. We're not living in 1995 anymore. A lot of mail clients have plugins or even built-in support for encryption.


He answers his own question in the article. "https://mail.google.com/support/bin/answer.py?hl=en&answ... says that the answer to “Is Google reading my mail?” is “No” but doesn’t elaborate ..."

So, he's looking for a more elaborate answer? This is PhilG, so I should cut him some slack, but it seems pretty clear to me.


Misleading.

What that document actually says is this:

No, but automatic scanning and filtering technology is at the heart of Gmail. Gmail scans and processes all messages using fully automated systems in order to do useful and innovative stuff like filter spam, detect viruses and malware, show relevant ads, and develop and deliver new features across your Google experience. Priority Inbox, spell checking, forwarding, auto-responding, automatic saving and sorting, and converting URLs to clickable links are just a few of the many features that use this kind of automatic processing.

In other words, Google IS most certainly "reading" your email, at the very least with computers, for various purposes.


I think there is a basic disconnect.

There is no way for a computer to do anything with an email (save it, give it to you etc) without 'reading' its contents. So the only meaningful thing for people to worry about is decoding the content, which IMO means a spam filter does not qualify as decoding. Nor does displaying ads for tires if the word cars shows up in your email.
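
To make the "ads for tires if the word cars shows up" case concrete, here is a minimal sketch of purely mechanical keyword matching. The table `AD_KEYWORDS` and the function `pick_ads` are hypothetical names invented for illustration; nothing here reflects how Google's ad systems actually work.

```python
import re

# Hypothetical keyword-to-ad table, made up for this example.
AD_KEYWORDS = {
    "cars": "Discount tires",
    "flights": "Cheap hotel deals",
}


def pick_ads(message: str) -> list[str]:
    """Return ads whose trigger keyword appears as a whole word in the message."""
    # Split the message into lowercase words; no meaning is extracted.
    words = set(re.findall(r"[a-z]+", message.lower()))
    return [ad for keyword, ad in AD_KEYWORDS.items() if keyword in words]


print(pick_ads("We talked about cars all weekend"))
```

The program has to touch every byte of the message to do this, yet it never builds any understanding of what the message says, which is the distinction being drawn above.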


So when you visit my website and I see Google pushing ads to you for Justin Bieber posters and pillowcases http://www.amazon.com/dp/B003FCXDRM/ we'll know Google isn't reading your Gmail.

... or are they? ;-)


Google does read your email for ads, and that fact is spelled out in their answer quoted above. AdWords/AdSense doesn't only apply to the content of the site you happen to be on, but is an aggregate of all they've learned about you over time. This means that data they collect from relevant keywords in an email is used to show relevant ads to you on a separate website.

I've never used AdSense so I don't know how it works. Is it possible for the site owner to read the ads that are displayed to you on a website? Does Google "leak" collected ad data to other websites which use AdSense? Or are these ads typically displayed in something like an iframe, where possibly the "host" website can't read them?


Google's Ads are served in an iframe.


I suspected as much, but as I've never used them I wasn't sure. Thanks for clarifying.


Google made a related statement last year when they fired an employee for unauthorized access to private data including emails.

"We carefully control the number of employees who have access to our systems, and we regularly upgrade our security controls–for example, we are significantly increasing the amount of time we spend auditing our logs to ensure those controls are effective. That said, a limited number of people will always need to access these systems if we are to operate them properly ..."

http://techcrunch.com/2010/09/14/google-engineer-spying-fire...


The same can be said about any IT department. One of my friends made the newspapers for sending a racially charged joke through the email system to the "boys joke list" in a rather large and respected company.

Turns out some people in the IT department were also helping themselves to the list as a source of entertainment; one of them took extreme offense to the joke and reported it.

After the thing went through its investigation process it was uncovered that the majority of disliked staff in the department were in fact reading everyone's email. Most of them had set up notification filters to flag emails with their names to see what people were saying about them.

Now the question is how common is this, considering how easy it is to do... and how the whole system more or less runs off trust?

I'd say if you have sensitive stuff going through your email that you don't want external parties to see, set up your own server and find someone who isn't petty to look after it.


For an IT department the expectation is that people will read your email. I honestly don't think what you describe is rare at all, based on stories I've heard.

And if you work for the gov't your email can often be requested by citizens via FOIA.


I was working for a government agency in New Zealand when they introduced a law that meant every email became a government communication and you couldn't delete anything. I usually operate on an empty inbox principle, so it was a total nightmare for me. I folded that contract as soon as my term was up.


Moving it to another folder wouldn't have sufficed?


Yep, but when you get hundreds of emails a day, you end up with folders inside folders. Not being able to trim excess is a total pain. Not to mention the pain of having limited mailbox sizes, so you have to ring the infrastructure vendor to archive your mail. And then put in requests to get older archives so you can search for stuff.


Has there been anyone working to make a comprehensive locally-hostable webmail/weboffice suite that replicates Google's functionality? I can imagine that this would be quite popular. Sure, some of the features like Priority Inbox might require complex heuristics, but things like labels should be easy to add support for.


Google used to sell hardware appliances, before they realized they would actually prefer to read all your documents.


How many individuals at google have the ability to read a particular person's personal data? If I'm a moderately important person[1], how many individuals, in or outside of google, will have access to any of that data over the life of it? And, how does google internally police against misuse? And, who watches the watchman?

[1] say, 1 in a thousand, of which google has 260,000 such gmail users


Probably about 50 altogether. Each of us promises to never ever look at personal data, before being granted privileges. And I'm pretty sure all those promises are serious.

I don't know how enforcement works, though a single offense would be firing-worthy.



From the comments:

> here’s a strange little story that happened to me a while ago – I set up a gmail account to deal with nigerian letters and such (I wanted to collect some data to report the spammers/thieves, without compromising my actual e-mail address in the process). I set this up with a fake username (something like george.thompson or so) and a password which included the word “nigeria” in it. Lo and behold, after my first login (before sending/receiving any mail) the targeted advertising in gmail included some nigerian ads (nigerian holidays, nigerian business bureau, etc). coincidence?….

If true, it seems they matched ads to the guy's password. Which means they needed to be able to read it plain text. The plain text should only ever live long enough to create or match with a hash.
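
As a minimal sketch of what "only long enough to create or match with a hash" means in practice, the following uses Python's standard `hashlib.pbkdf2_hmac`. The function names, the iteration count, and the choice of PBKDF2 are assumptions made for illustration; this is not a description of how Google stores passwords.

```python
import hashlib
import hmac
import os


def hash_password(password: str, iterations: int = 200_000) -> tuple[bytes, bytes]:
    """Derive a salted hash; only the salt and digest would be stored."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = 200_000) -> bool:
    """Re-derive the hash from the login attempt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(digest, expected)
```

Under a scheme like this the plaintext exists in memory only during signup and login, so an ad system working from stored data would have nothing readable to match against.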


Don't be too quick to judge - maybe he had been searching for Nigerian related things before creating the account.


You, and the other respondents, are probably right. Sometimes I'm too quick on the draw.


I seriously doubt google would persist any parts of passwords in plain text.


I think the experiment would only be illustrative if the computer had zero past Internet use (ie no cookies) and the ip address was brand new. Surely google tracks even if you don't have an account.


Could he have visited any Nigeria-related sites or performed any Nigeria-related searches?

He'd have to have a sterile browsing environment to ensure his ads weren't related to something else on Google's ad network.


More likely he was doing a bunch of searches about Nigerian scams etc. before/during sign up and they matched it that way.


What about Facebook?

A few weeks ago I had a situation where Facebook contacted me about a job, and it appeared that Facebook may have been reading (or at the least mining) private messages related to my startup (http://news.ycombinator.com/item?id=3035376). A Facebook employee replied to the thread but wouldn't provide details.


(I am neither a Facebook employee nor can I speak for them.)

I really think that was just coincidence. Considering how many people talk about startups over Facebook messages and how many people have been contacted by Facebook recruiters, there's bound to be at least one person who was contacted by a Facebook recruiter a day or two after talking about startups. It'll seem mighty suspicious to that person, and they'll blog about it. Congratulations, you're the lucky one.

Companies that big just generally don't do things like read private e-mail, because they know it'll get out somehow - disgruntled employee, whistleblower - and the damage to their reputation is totally not worth whatever they can gain from it.


The thing people don't realize is that when contacted by a big company for a job, usually it's a bot that just sent hundreds of emails.

Receiving an email from Google or Facebook is nothing special.


What data sources does Facebook use to trigger the email?

And are emails manually approved before they go out? If so, does that mean a Facebook employee looks at the data source that triggered email, which in some cases may be the user's private messages?


It seems to be a large leap to go from "sends lots of emails" to "automates finding people to recruit." There is a similar leap from "automates finding people to recruit" to "uses private messages as signal."


philg's comment:

"Folks: The problem with Google promising to hold “personal information” confidential is that a document may not meet the definition of “personal information” given in http://www.google.com/intl/en/privacy/faq.html#toc-terms-per... . For example, if you write down some ideas for a new product and keep it as a Google Doc there may be nothing in the document that identifies you as the author and therefore it might be okay for Google to read or distribute the document.

Generally speaking the Google privacy docs address a separate issue from the question of keeping email or documents confidential. They are about the question of whether your identity is kept confidential when you’re browsing around the Web and Google is figuring out what your interests and demographics are. Worthy stuff to be writing about, no doubt, but it doesn’t shed much light on the subject of whether a Google employee can copy and paste paragraphs from your email messages or Google docs."

http://blogs.law.harvard.edu/philg/2011/11/03/where-does-goo...


Google should provide the option of automatically encrypting your incoming email with your public PGP key, if you provide one. They could similarly automatically encrypt your Sent folder (not the actual sent messages). That way, even if your account is compromised, or emails subpoenaed, they can't read the emails without your private key.

That's exactly what I do on my system anyway. I host my own email as well though: https://grepular.com/Automatically_Encrypting_all_Incoming_E...


How could they? They have ~24,000 employees and 260 million Gmail users.

http://www.quora.com/How-many-employees-does-Google-have and http://en.wikipedia.org/wiki/Gmail respectively.

(Although off the top of my head, they do say they will automatically scan all email for the purposes of advertising, spam filtering, etc. and the debate used to be "does this count as reading if no human is involved?").
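
A quick back-of-the-envelope check of those two figures, as a sketch only. The constants are the approximate numbers quoted above, and the assumption that every employee could be put to work reading mail is deliberately unrealistic.

```python
EMPLOYEES = 24_000          # approximate headcount quoted above
GMAIL_USERS = 260_000_000   # approximate Gmail user count quoted above


def users_per_employee(users: int, employees: int) -> float:
    """Average number of accounts each employee would have to cover."""
    return users / employees


print(round(users_per_employee(GMAIL_USERS, EMPLOYEES)))
```

That works out to roughly 10,800 accounts per employee, so blanket human reading is implausible, although the ratio says nothing about whether a specific account could be read.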


That just proves that they can't read everybody's email.


It is probably pointless for Google to use humans to read mails, especially the mails of regular people. And it is certain that they use automated processing of the information in the mails.

A most relevant question would be: Does Google match the profiles of its mail users to high profile people, such as company managers or other rich people who can be easily found on, for example, Linkedin? If so, does Google automatically filter their email in order to predict, for example, the stock market?


I don't think terms provide too much assurance - maybe they're only really useful if legal recourse ensues.

There will always be the possibility of a rogue employee who goes out of his or her way to read data that doesn't belong to them.

In the case of Dropbox - to prevent this possibility - we can encrypt our data if we choose to (I've successfully used 'encfs' in the past). In the case of Google's email and document services I don't think this is possible?


Or use SpiderOak, Tarsnap, or Wuala, all of which handle encryption natively.


http://www.wired.com/threatlevel/2010/09/google-spy/

http://gawker.com/5637234/gcreep-google-engineer-stalked-tee...

Even if they say they won't it looks like they are lacking some internal controls.


Rogue employees are a fact of life. That they were caught shows that Google's internal controls are (at least reasonably) effective.


Right in their privacy policy (http://www.google.com/intl/en/privacy/privacy-policy.html):

We may collect the following types of information:

  <snip>
User communications – When you send email or other communications to Google, we may retain those communications in order to process your inquiries, respond to your requests and improve our services. When you send and receive SMS messages to or from one of our services that provides SMS functionality, we may collect and maintain information associated with those messages, such as the phone number, the wireless carrier associated with the phone number, the content of the message, and the date and time of the transaction. We may use your email address to communicate with you about our services.

  <snip>
In addition to the above, we may use the information we collect to:

Provide, maintain, protect, and improve our services (including advertising services) and develop new services; and Protect the rights or property of Google or our users.

If we use this information in a manner different than the purpose for which it was collected, then we will ask for your consent prior to such use.


Uh, that privacy policy (or at least the snippets you posted) provides NO explicit assurances that Google is not reading your stuff. At all.

It tells you what they WILL do. And it says if they decide to "use" (vague) "this information" (vague) in other ways, they'll ask for consent. Does simply reading it count as "using" it? Google doesn't say.

I wouldn't trust these assurances at all.


The section you've excerpted could be interpreted as saying that Google does reserve the right to have both its software and personnel read your email in order to "[p]rovide, maintain, protect, and improve our services (including advertising services) and develop new services; and Protect the rights or property of Google or our users."

Which is, more or less, for any reason whatsoever except those already blatantly illegal (harassment, blackmail, etc.).


I am more concerned about Google making an arbitrary decision to delete my Gmail account.


That's why I got my own domain. Coupled with offlineimap (or any other mail backup software) running regularly, I can move from Gmail to any other provider in the time it takes to create a new account and update the MX records.


Does end-to-end email encryption get any traction? I remember PGP was supposed to do it. Not sure how far that went.

It seems that for a company using third-party email hosting, it shouldn't be too inconvenient to set up company-wide encryption on its emails.


It's not google you have to fear.

It's the warrantless backdoors installed for most government entities.

What you think is harmless to discuss is probably being auto-indexed by them to use against you if you ever become a "problem" (ie. protest war, etc.)


Aren't emails subject to the legal definition of private correspondence and therefore protected from this kind of behaviour? (I believe this is the case in France, Italy, Spain or Germany)


Store your plaintext email on someone else's server. Act surprised it's not private.


what makes you think it's plaintext?


If it weren't, then it wouldn't be an issue. Right?

And what makes me think that is that I suspect 99.999% of email is in plaintext. I have no foundation for this. Just a gut feeling.



Aren't there some laws about secrecy of correspondence that already cover this?


No, Google doesn't read your email. In fact their spam detection just works by flipping the world's luckiest coin at every incoming email.


Where does it say on harvard.edu that they won't kill students and harvest their organs?

Edit: Oh sorry forgot that people in the land of the free don't have a human rights act preventing Google reading their email (not sure about harvesting organs)


The Gmail man reads your email and docs.

http://www.youtube.com/watch?v=yXqrTfOWx60


They're definitely reading my email.

I know this for a fact as the last time I mailed them they replied.


This should be covered by the statement "don't be evil"


If the FBI comes knocking on Google's door and says they have a terrorist under surveillance, and they need to read his email in order to save a busload of schoolchildren from a bomb, should Google let them do it?

Is it evil to read a bad person's email, or is it evil not to read it?

That's why a clearly stated policy that says exactly when Google will or will not let a human look at your documents and email is important. "Don't be evil" is cute, but it's not a policy statement.


> If the FBI comes knocking on Google's door and says they have a terrorist under surveillance, and they need to read his email in order to save a busload of schoolchildren from a bomb, should Google let them do it?

From the Policy:

[When] We have a good faith belief that access, use, preservation or disclosure of such information is reasonably necessary to (a) satisfy any applicable law, regulation, legal process or enforceable governmental request

About how and when they access your information:

We restrict access to personal information to Google employees, contractors and agents who need to know that information in order to process it on our behalf. These individuals are bound by confidentiality obligations and may be subject to discipline, including termination and criminal prosecution, if they fail to meet these obligations.

That "need to know that information" seems to restrict casual access, with the threat of termination to back it up. All of these are more specific declarations in line with a general "don't be evil" policy.


A very real threat of termination: http://www.pcmag.com/article2/0,2817,2369188,00.asp


Every such request gets individually checked over by the lawyers. Many get pushed back on.

http://www.google.com/transparencyreport/governmentrequests/...


It's federal law that Google needs to hand over everything they have when the FBI comes calling on these cases. So Google's only choices are hand over the content or wait 30 minutes until people show up and start dismantling their data-center and then go to jail.


It's federal law that Google needs to hand over everything in the scope of the warrant. No law requires Google to give the FBI "everything they have", especially if there isn't a court order.


Nor is the GP post accurate about what Google's options are, since it's obviously an option to bring in the lawyers, resist, and get a court order to block FBI's actions for a while, if Google were inclined to do so.

I tend to scoff at the notion that FBI would start "dismantling data centers" in the manner described.


Note: This was about "Lulz Security group and any affiliated hackers" which means the FBI was far from a ticking time bomb scenario.

http://bits.blogs.nytimes.com/2011/06/21/f-b-i-seizes-web-se... The F.B.I. seized Web servers in a raid on a data center early Tuesday, causing several Web sites, including those run by the New York publisher Curbed Network, to go offline. ... “Our servers happened to be in with some naughty servers,” he said, adding that his sites were not the target of the raid. Curbed is working to get its sites back online, probably by Wednesday.

PS: When the FBI shows up you have zero rights to prevent them from taking anything on site; the only thing a lawyer can do is prevent evidence from being used in a court case after the fact. Granted, at Google scale they are more concerned with backlash and it takes a lot of effort to confiscate that much hardware, but that's not going to stop them in the case of a major incident. Also, merely failing to help is breaking the law, let alone trying to stop them.


The question is wrong. Any email provider will read your email, at least for spam detection.

The question should be - does Google keep a profile on you based on the content of your emails?

I'm pretty sure they are able to serve ads without keeping a profile, making the decision solely on the content of the email being read. However, that may not be the case.


> The question is wrong. Any email provider will read your email, at least for spam detection.

He's clearly talking about humans reading your mail. No email provider has humans reading your mail for purposes of spam detection, at least not in the usual case.

The question is perfectly valid: seeing that other companies have clear policies on the extent to which employees can access your data, how come Google doesn't have any such policies?


They keep some sort of record on your emails though - a while ago they mentioned that they were "Bringing better ads to gmail" - what they think are relevant ads based on other emails you've received when they can't find relevant content for the one you're currently viewing.

Not sure what kind of profile this ends up being, but it can be quite entertaining when I get ads about sumo wrestling (because of previous emails from appsumo)...



