How I Erased Facebook Comments and Likes (jaruzel.com)
520 points by Jaruzel on March 27, 2018 | 246 comments



You can't "delete" anything from the Internet. Despite Snapchat saying you can. Or a big button that says "Delete". Furthermore, Facebook doesn't need you to have an active, visible profile for it to collect data on you. They still have tracking pixels. "Erasing" comments and likes doesn't do anything or get to the heart of the issue.

The best way I know of to combat Facebook is to poison their data. Which means taking the effort to "friend" strangers across the world, "like" random things, visit websites you don't care about and basically blur your profile to the point that Facebook can't tell who you are or what your actual interests are. It's quite an effort over years.

But no. Everyone wants a button. Which is how we ended up in this situation to begin with.


I don't understand why more people don't understand this. Each "post" is a record with a field called "deleted". You can make that field True. You can't remove that record.
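
To make the "deleted" field concrete, here is a minimal sketch using Python's built-in sqlite3. The table and column names (`posts`, `deleted`) are made up for illustration; it is only an assumption that a real site's schema looks anything like this.

```python
import sqlite3

def make_db():
    """Create an in-memory database with one hypothetical `posts` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts ("
        "id INTEGER PRIMARY KEY, body TEXT, deleted INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("INSERT INTO posts (body) VALUES ('my embarrassing comment')")
    conn.commit()
    return conn

def soft_delete(conn, post_id):
    """What a 'Delete' button may do: flip a flag and keep the row."""
    conn.execute("UPDATE posts SET deleted = 1 WHERE id = ?", (post_id,))
    conn.commit()

def visible_posts(conn):
    """What the user sees afterwards."""
    return conn.execute("SELECT body FROM posts WHERE deleted = 0").fetchall()

def all_rows(conn):
    """What the database owner still has."""
    return conn.execute("SELECT id, body, deleted FROM posts").fetchall()

conn = make_db()
soft_delete(conn, 1)
print(visible_posts(conn))  # []
print(all_rows(conn))       # [(1, 'my embarrassing comment', 1)]
```

The post vanishes from the user's view, but the row and the fact that it was flagged both remain.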

"Deleting" only gives the database owner one more piece of information about you: that you wanted to delete that record.


I assume they did some sort of paranoid delete. I was thinking of getting rid of my Facebook account (once I deal with OAuth), and I was wondering whether it would be better to delete my account or write a script to flood it with garbage.

I was thinking of having it run as a cron job on my laptop. Log in via chromedriver, accept a random couple of suggested friends. Download some random images and upload them, tagging myself. Grab some random comments off Twitter and post them as my own. Like random posts. Remove a few real friends. Repeat.

Maybe even Facebook Munging as a Service. :D


I've been entertaining this idea for a while.

Train a NN/markov chain on enough of my own and publicly available facebook posts and comments that I can make it spew out realistic-looking posts and comments on demand.

Make another script that likes and unlikes pages, joins and leaves groups.

Make another that uploads random photos (or randomly generated image content) and tags me in them.

Stitch them all together into something that runs all of those on my account, generate so much noise that it becomes a tedious exercise to identify real comments and posts simply by virtue of volume.

For bonus points, run it from my home computer.

Would it be perfect? Probably not, there's probably way more you could do to simulate a real user, but it'd make a fun project if nothing else.
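
For the text-generation part, a rough sketch of a word-level Markov chain might look like the following. The tiny `corpus` string is a made-up stand-in for real posts, and `build_chain` / `generate` are just names picked for this example; a real attempt would need far more training text and probably a higher `order`.

```python
import random
from collections import defaultdict

def build_chain(text, order=1):
    """Map each run of `order` words to the list of words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain

def generate(chain, length=12, seed=None):
    """Walk the chain from a random starting key, emitting up to `length` words."""
    rng = random.Random(seed)
    key = rng.choice(sorted(chain))
    out = list(key)
    while len(out) < length:
        followers = chain.get(key)
        if not followers:  # dead end: no word ever followed this key
            break
        out.append(rng.choice(followers))
        key = tuple(out[-len(key):])
    return " ".join(out)

# Made-up sample text standing in for a real post history.
corpus = (
    "i love this band so much . i love this pizza place . "
    "this band is playing downtown . this pizza is so good ."
)
chain = build_chain(corpus)
print(generate(chain, length=10, seed=1))
```

Each next word is picked only from words that actually followed the current one in the training text, which is why the output looks locally plausible and globally meaningless.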


A lazy implementation of this idea might be to just have it post content from https://www.reddit.com/r/SubredditSimulator/


You say lazy, I say making use of existing resources :P (or not re-inventing the wheel).

Seriously though, Subreddit Simulator is probably actually a pretty good idea: a lot of them come with a handy image/gif attached and occasionally you get results that could pass as being written by a human.


That sounds human, albeit confusing. Very cool!


I was thinking the same thing ... but I was going to steal random content from the Twitter firehose. You'll want a bit of smarts behind it so that it's not obviously you (e.g. don't post much when you'd normally be sleeping). It's relatively easy to automate a browser to do this these days (giving you a realistic user-agent and behavior). You'll also want to avoid registering key-presses at an unrealistic (or constant) rate. Apparently you can even be identified by your typing. I guess you could train it on your actual typing but that might be overkill.
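
A minimal sketch of those two bits of "smarts", with every number being a guess rather than anything measured: per-key delays drawn from a normal distribution instead of a constant rate, and a check that keeps the bot quiet during an assumed sleeping window.

```python
import random

def keystroke_delays(text, rng, mean=0.18, spread=0.06, floor=0.04):
    """Return one delay in seconds per character, normally distributed and clamped at `floor`."""
    return [max(floor, rng.gauss(mean, spread)) for _ in text]

def is_awake(hour, sleep_start=23, sleep_end=7):
    """True if `hour` (0-23) falls outside the assumed sleeping window."""
    if sleep_start > sleep_end:  # window wraps past midnight
        return not (hour >= sleep_start or hour < sleep_end)
    return not (sleep_start <= hour < sleep_end)

rng = random.Random(42)
delays = keystroke_delays("hello world", rng)
print([round(d, 2) for d in delays])
print(is_awake(3), is_awake(14))  # False True
```

The delays would be fed to whatever browser automation types the text. A normal distribution is only a convenient placeholder, since real typing rhythm depends on which keys follow which.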


You will probably always lose.

It is someone's daily job to try and enrich the data they are getting from you, while you are working on your bot(s) in your spare time. You simply cannot spend as many resources to keep ahead of the algorithm.

EvilCorp™ will always gain more benefit out of adapting the algorithm to even include/exclude your bot since they get profit gains across the board.

Moreover, you need to spend a lot of time on your bot(s) to make them blend in. I'm not saying "don't do it"; I'm saying "it is going to be hard to get real benefits".


It's asymmetric; it is significantly easier to make a clear image harder to read than it is to make a fuzzy image clearer, and the same applies to data sets.

There might be swaths of teams all focused on enriching the data, but you eventually can't derive more information out of a deliberately flat information graph, and all it takes is a few extreme data points to distort the average.


It certainly would. I don't know what a Markov chain is, but it sounds cooler than me dot rand'ing my way through random Twitter posts.

Yeah, OAuth and events are the only things keeping me on FB. Although, I believe they opened up their event system to non-FB users, probably so they could build shadow profiles.


Once Facebook's algorithms detect that your account is managed by bots, they can simply discard whatever you did since you started the bots, rendering your efforts useless. I am guessing it wouldn't be that hard for Facebook to detect anomalies on an account: any abnormal activity, such as liking and unliking pages several times a day, sending random people friend requests, etc.


Great idea, and hilarious, BUT: you're going to end up spamming the crap out of your REAL friends with all the nonsense you post, so I think step one would be to block or de-friend anyone you care about in your social graph, first.


That’s perfect! Because then they would start removing me as a friend and the whole process wouldn’t be one-sided :-)


Someone went and did this :D

https://gitlab.com/danso/derivejoy


Even that doesn't guarantee your previous data is gone; nothing is preventing them from maintaining a history of changes, meaning they would have both your new, munged, data AND your older data, pre-munge.

Not sure what you are gaining with this.


My “data” would be a cluster fuck, technically speaking.


If you are in the EU and press delete and the database owner (data controller) does not delete the post then the database owner is now breaking the law. As of ?may? this will carry significant fines.


I don't think it's the case yet. GDPR will enforce the right to be forgotten and as such truly delete data about you, but other than that, I don't think there are other regulations forcing you to delete data.


This is why I said as of ?may?. The GDPR comes into force this year.


I'm not familiar with the specifics of the law, but my guess would be that it does not require that something be deleted upon any delete button click. I would assume it requires full deletion when requested through proper channels.

There are plenty of cases where a system would not function correctly if you are actually erasing DB entries when a user clicks the delete button. In many cases, even the users themselves might expect to be able to undo or go to a delete list and see entries that were deleted.


At my company, I heard them discussing deleting the data the user requested within 1 month.


Oh, my bad, I misunderstood your sentence.


Let's wait and see, I would guess that we are going to see the first cases within the coming five years.


Why be the test case? Your company will be severely impacted by a fine.


No, they're required to delete data if you specifically ask them to (they most probably will change "delete account" to "deactivate" and you'll have to send them a snail mail request if you want a complete removal) and they're not obliged to delete data partially - it's either this (deleted=true) or whole account.


I suggest that you consult a lawyer.

EDIT: In my layman's opinion pressing that button is an explicit request to delete the data. Also the kind of behavior you are suggesting is not private-by-default and goes against the spirit of the law. I don't think that a judge will look kindly on it.

I am not a lawyer. I am not your lawyer. I suggest that you consult a lawyer.


Yes, but right now the law does not apply. In the future they will probably change 'delete' to 'hide' or even get rid of this option, and then the rest of what I said applies. I incorrectly wrote "they're" instead of "they'll be".


How does the EU propose to enforce this?


Fines of up to 4% of worldwide revenue or €20 million, whichever is larger. Proving that the data is actually deleted is another matter, but the potential upper bounds of the fines alone place a pretty big incentive on treating all data as if it's tainted.
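
Spelled out as arithmetic, the upper bound described above is just a `max`. The revenue figures below are round, made-up numbers, not any real company's accounts.

```python
def max_gdpr_fine(annual_worldwide_revenue_eur):
    """Upper bound as described above: 4% of worldwide revenue or EUR 20 million, whichever is larger."""
    four_percent = annual_worldwide_revenue_eur * 4 // 100
    return max(four_percent, 20_000_000)

print(max_gdpr_fine(5_000_000))       # small company: 20000000
print(max_gdpr_fine(40_000_000_000))  # large company: 1600000000
```

The crossover sits at €500 million of revenue: below that the €20 million figure is the ceiling, above it the 4% takes over. This is a ceiling, not the amount a regulator would actually impose.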


Many smaller tech companies aren't even chasing the European market yet. My guess is that this will make it an even less attractive market.

That being said, the jury is still out on whether the EU can successfully collect fines from exclusively US based companies.

Large corporations like Google or Facebook have a presence in the EU and can be fined directly. Good luck in the courts enforcing GDPR against US-only companies that have no servers or physical presence there. I imagine it will be a difficult process and not worth the effort in most cases.


I suspect this is EXACTLY why Cambridge Analytica is located in the UK, and EXACTLY why, they promoted the Brexit campaign: so that they would be exempt from EU privacy laws.


This is why I'm waiting until the GDPR kicks in before deleting my account!


How does the database owner verify that they deleted you?


IANAL, but isn’t "true deletion" legally required under GDPR?

Wouldn't leaving personal data lying around in your database like this technically leave you liable to massive fines?


AFAIK GDPR mandates that deleted data is no longer processed (used for marketing, analyzed, etc). I don't remember it saying anything about the data "existing"; nor could it given how backups work in our industry.


Furthermore, I'd imagine that the host might be required to retain those records for law enforcement purposes (imagine if someone made death threats, etc and law enforcement requested those records for an investigation).


GDPR does have exceptions for legal compliance, yes. If someone does something illegal and they request deletions you can (IMO, not a lawyer) retain the records if you know there is an investigation or court case going or strongly suspect so or initiate it yourself.


You may even be able to keep comments/posts for longer for general compliance. You will need an audit explaining why and to be prepared to defend it. You won't be able to use such data for analytics.

This is based on my personal layman's understanding. I am NOT a lawyer. I am NOT your lawyer. If you need legal advice consult a competent lawyer in your jurisdiction.


Legal/Regulatory compliance is not for Facebook or Amazon.

It is to prevent people going to their Bank, where they have a credit card or a consumer loan, or a mortgage, and say "can I please be forgotten/be wiped from your systems - oh and the loan too!".


> I don't remember it saying anything about the data "existing"; nor could it given how backups work in our industry.

You just store the keys separately, assuming one-way encryption (losing the key) counts as deletion.
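
A rough sketch of that idea, sometimes called crypto-shredding: encrypt each user's data under its own key, let the ciphertext flow into backups, and "delete" by destroying the key. The cipher here is a toy keystream built from SHA-256 purely to keep the example standard-library-only; it is an assumption for illustration, and anything real should use a vetted encryption library. Whether this legally counts as erasure is a separate question.

```python
import hashlib
import secrets

def _keystream(key, nonce, length):
    """Toy keystream: SHA-256 in counter mode. A stand-in for a real cipher, not for production."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def encrypt(key, plaintext):
    """Return a random 16-byte nonce followed by plaintext XOR keystream."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))

def decrypt(key, blob):
    """Split off the nonce and XOR the body with the same keystream."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))

# Per-user keys live in a small, mutable store; ciphertext goes to backups.
key_store = {"user42": secrets.token_bytes(32)}
backup = {"user42": encrypt(key_store["user42"], b"private message")}

recovered = decrypt(key_store["user42"], backup["user42"])
print(recovered)  # b'private message'

# "Deleting" the user: drop the key. The backup is untouched but unreadable.
del key_store["user42"]
print("user42" in backup, "user42" in key_store)  # True False
```

The catch is that the key store itself must not end up in long-lived backups, otherwise the problem has only been moved.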


IANAL - no idea what "erasure" as in the "Right to erasure" means http://www.privacy-regulation.eu/en/article-17-right-to-eras...

But... if deleted data is no longer processed, does that mean it can no longer be used for targeting etc?


It means deleted as in gone.

Your backups should be cycled after a few months and the deleted data will eventually not be present in any backups.


There’s a bunch of cases where a business should not delete data even if requested by a user; the biggie is crime/fraud prevention. However, if that data was later disclosed in a breach the company would be subject to penalties. The request for removal does count as rescinding consent, so you would no longer be able to do anything that required consent anyhow.

Data could be thought of as radioactive so long as it has a unique identifier. If you can aggregate your fraud detection data in some way to remove the pseudonymous/personal identifiers then you should. If you can’t, then your usecase needs to justify the risk of keeping that radioactive material around.

Watch out that your aggregates can’t be reverse engineered though. There’s a reasonableness test around how easy it would be to recompile a user’s profile etc. As technology advances, things that were once unreasonable become reasonable (think md5). I find it helpful to think of reasonableness being connected to the best 10 people you recently interviewed, or the actions of any competitor in the space. If a prosecutor can point to the competitor and ask why you didn’t do what they did, you need a very good reason to pass reasonableness.


Prove we didn't delete it.


It would be extremely risky for such a huge company to pull a stunt like that. All it would take is for one employee to become a whistleblower to put the company into a legal and PR quagmire. And all that for what? Losing 0.1% of their data because of people deleting their account? Not worth it IMO.


In which event Huge Company finds the lowest level employee they can reasonably pin the blame on.


But they still get fined 4% of their global revenue. A few $1.5bn fines and they'll learn how to delete.


I haven’t read the relevant (proposed?) legislation, but aren’t these sorts of penalties usually written as ”fines up to x”?


The EU has recognized that a 200 million fine won't hurt facebook. Instead it says "fines up to X€ or Y% of global turnover, whichever is higher."

I also strongly suspect that in cases like FB or Google, the judges will happily go for the 4% mark if possible.


Ok, great thanks for clearing that up.

Yes, this is a good thing, in my opinion. There was a case where a Finnish man was fined 54,000 euros for speeding, where the fine is calculated based on income[1]. I think this seems like a reasonable way of meting out penalties.

It seems reasonable to me companies should be treated in a similar manner.

1. http://www.bbc.com/news/blogs-news-from-elsewhere-31709454


At least for regulations like these, yeah. Big corps like Facebook don't care if they get hit with 200 M€. 4% of Facebook's global turnover however...

I'm sure the EU will find some way to spend 2 billion € for something useful. (13% of their net income btw)


Yes. A minor technical cockup when the company is attempting to meet the spirit of the law wouldn't be treated as harshly as a systematic failure, or a deliberate attempt to work around the spirit by following the letter.

If facebook discover a backup from 2012 on a tape that hasn't had items deleted, and adapt their processes so it doesn't happen again, they won't be hit with a $1b fine. If they deliberately refuse to delete people's data as a policy, they will be.


This one is at least...


Maybe I'm naive but I'm not sure how you could spin something as fundamental as "deleting does not actually delete anything" as the work of one rogue employee. What would said employee have to gain from it anyway?

I'm sure Facebook would do anything it could to work around these regulations and lobby as much as possible to have them overturned but I doubt they're reckless enough to simply ignore them and hope for the best.


Pretty easy. Develop code to delete. Code doesn’t work as “designed” and just makes the qc/qa think data was deleted. Automated analysis keeps using it.

No human ever directly accessed data so they don’t “know” it exists.

Ad targeter or whatever needs the data keeps working.

This seems like a trivial programming problem. I’m confused by the confusion.


You need people to develop and maintain this code, and it touches many aspects of architecture. Keeping it contained to a small team and avoiding any leaks doesn't seem trivial to me. What if some dev not in the know notices the issue and submits a fix or a bug report, what happens then? You introduce a pretty big liability in your management if you need to protect yourself against your own employees. And what if the employee(s) you chose to throw under the bus in case the trick gets noticed decide to rat you out anyway, for fame, money or to avoid repercussions? How do you even find engineers willing to do something so unethical without raising suspicions in the others?

And again, all that for what? Keep profiles on the small minority of users who bother to scrub their profiles, even though these same users are probably not very good targets for ads in the first place? I don't think it would be very rational for Facebook to try something like that unless they're trying to be evil for the sake of being evil.


It would require a whistleblower, which could definitely happen.


Especially if they give a percentage of the fine to the whistleblower.


1% of 4% of Facebook's annual revenue sounds like a good deal to me. Who wouldn't be a whistleblower for $15MM?


Prove a negative?

The onus should be on you to prove you did.


Yes.


It's because most people are not computer scientists/hobbyists/engineers. They are laypeople whose knowledge of the internet, and computers in general, is very limited.


One anecdote from my life. In university, one of my best friends at the time found the browser function "view source" and was convinced he could now hack the website.


I’ve probably had to explain at least 1000 times that our software being open source does not mean that people can edit the functionality of the software running on our servers.

One thing that is cool about all these pushes to teach every kid programming is that it might help instill “data values understanding.” Or whatever you call the source->object->runtime. And how data works.

Too many people confuse software with data “the data is in the app” and confuse it all with a book in a library. You just burn the book and it’s gone. Or you transfer the book from one library to another. This is a good metaphor for usage, but falls apart if you have privacy or security concerns.

Logic that seems natural and instinctive to whatever you call people who program/use tech is incorrectly understood by others. Especially “digital natives” who always had computers and Internet.

My kid has all of his files in google from forever. This breaks down in situations where Google’s interests are out of sync with his own.


I mean, depending on the quality of the website, he now can.


In the EU you can.


Most often true, luckily for European citizens GDPR is changing that.


>I don't understand why more people don't understand this

Because of the round trip fallacy: People equate the absence of evidence as evidence of absence P(E|A) != P(A|E)
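
A small worked example of why the two conditionals differ, reading A as "the data is really gone" and E as "I can no longer see it" (that reading of the notation is my interpretation). All three input probabilities are invented for illustration.

```python
def posterior(prior_a, p_e_given_a, p_e_given_not_a):
    """Bayes' rule: P(A|E) from P(A), P(E|A) and P(E|not A)."""
    p_e = prior_a * p_e_given_a + (1 - prior_a) * p_e_given_not_a
    return prior_a * p_e_given_a / p_e

# Invented numbers: data is rarely truly erased (10%), it is always invisible
# once erased, and it is almost always invisible when merely flagged (95%).
p = posterior(prior_a=0.1, p_e_given_a=1.0, p_e_given_not_a=0.95)
print(round(p, 3))  # 0.105
```

So with these numbers P(E|A) is 1.0 while P(A|E) is only about 0.1: not seeing the post barely moves the odds that it was actually erased.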


This can be used against them by updating incompressible Gaussian white noise over and over... if what you said is true.


Somehow related to your comment, I wanted to post a link to this extension:

https://adnauseam.io/

What it does is basically hide the ads from you while clicking all of them.

This is so messed up for ad companies that Google removed it from the Chrome Web Store.


I think that such strategy with data poisoning will not be effective because based on current data your activity will be marked as anomaly and will not go into any dataset. So this strategy is the same as doing nothing.

But I agree that it's pointless to delete visible data about yourself, because statistics are already stored somewhere else and you don't have power over it.


> based on current data

It might well be useless to start today with your existing accounts. But it's a principle worth knowing about nonetheless. It's certainly possible to poison data from the beginning of your contact with a service - especially for ones less pervasive than Facebook. An account that never sees entirely true data may not be able to build up an accurate profile.

Having said that, providing false information does violate the ToS for most sites. Which probably isn't a federal crime, given US v Lori Drew, but it's certainly not a good idea for an account you can't afford to have suspended.


I wouldn't say it's pointless. What if someone gains access to your account through social or technical means? Sure, Facebook/NSA/etc. have a copy, but at least customs agents, hackers, and crooks wielding a rubber hose can't access it.


Now there's a business idea.. have an app running on your phone and PC that watches your "normal" facebook activity to establish a baseline, and ever so slightly perverts it over time using the same connection so the IP/location information still lines up. Slowly and methodically so the information doesn't get flagged as bogus. Start with overtly partisan political stuff since that's the easiest to classify.

Call it Social Chaff as a Service (SCaaS).


I have been fantasizing about a combination of data poisoning and intervention where the poisoning is actually real: make groups to swap profiles for a while. Without even going back to the original, but just jumping to the next one. As long as you agree through a different medium and swap credentials with someone you somehow trust, it should work.


why go through all that? who would go through all that? making your own life more difficult in spite of a company? it's not sustainable

note: I know that some people actually would. but philosophically speaking I just don't get it, it's almost like it's missing some kind of deeper point.


> I think that such strategy with data poisoning will not be effective because based on current data your activity will be marked as anomaly and will not go into any dataset. So this strategy is the same as doing nothing.

You give them too much credit. This would require some PLM being concerned with such activity enough to add the anomaly detection. Unless they are way proactive over theoretical issues.


Why not build a browser plugin that visits 5 random websites for each site you visit.


https://adnauseam.io/ There you go. Not on the chrome extensions store, but works on chrome (according to their site, I can only vouch for Firefox). I had this question a while ago, and debated building my own, but laziness won out. Edit: I'm an idiot, didn't read that you specified random websites. Adnauseam just clicks on random ads to pollute marketing data.


Why are these people not creating spaceships, exploring the deep of the oceans, or curing cancer?


There is no advertising money in those fields.


Lest you think this is an exaggeration: graduate students and postdocs, who would be doing the cancer research, make peanuts.

Starting salary for a postdoc is a bit less than $50k, and this is after 5+ years of grad school, during which you make even less (often $20-30k). One can expect to be a postdoc for 3-5+ years before becoming remotely competitive for an independent (faculty) position, and another 5 years before the job is relatively secure. The competition for these jobs, and for the remaining industrial ones, is savage and unpredictable.

This is bad for so many reasons: frantic people don’t do careful experiments or do good science, very smart people opt out altogether and figure out how to make you click on ads instead, and a lot of time, money, and effort gets wasted in the churn.

If you want more cancer research and less ad optimization, nag your reps to improve how we fund research!


I feel like you could pull a 'natural flow' subset out.


I feel like searching for natural flow will not give me the results I want.


Not if you have a natural flow for all the random visits.


I mean, at that point you're having an arms race against Facebook's machine learning. My money's on them.


Data poisoning doesn't work. See:

https://www.youtube.com/watch?v=1nvYGi7-Lxo&app=desktop

The gist is social media posts can be used as a unique fingerprint to correlate your whole browser history. Unless you are willing to spam your friends with useless links....

Researchers discovered a politician had a certain medical condition and a judge had "interesting" habits, even in privacy centric Germany. All from publicly available apis and data for sale.

They specifically say their algo is immune to data poisoning.


The 4th paragraph from TFA:

> I know I cannot close the barn door on any data of mine that's already out in the wild, but I can control any further scrapes of my Facebook data by manually removing as much of my Facebook Activity as I can. Unfortunately, and not unexpectedly, Facebook do not give you a simple way to do this.


There is an excellent book called "Journey into the Whirlwind" about Stalin and the Soviet-era purges. One of the strategies of those arrested was to name 2 more. While a great concept in theory, it just made the purges worse.

https://www.amazon.com/Journey-into-Whirlwind-Eugenia-Ginzbu...

I always come back to the idea that structural regulation works best. At the end of the day, we just need to make it very costly to keep too much information. We set rates for personally identifying attributes (e.g. storing a birthdate costs X, a partial SSN Y, etc). The charge is per attribute per person per day. This should incentivize the tech companies to develop and store broad scores (e.g. scores high for likes sports and classic rock, low for opera) rather than personally identifying info.
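
To show how such a charge would add up, here is a back-of-the-envelope sketch. The rates, attribute names and people are all hypothetical; the only thing taken from the proposal is the "per attribute per person per day" structure.

```python
def daily_charge(rates, stored_attributes_by_person):
    """Sum the rate of every attribute held for every person, for one day."""
    return sum(
        rates[attr]
        for attrs in stored_attributes_by_person.values()
        for attr in attrs
    )

# Hypothetical rates in cents per attribute per person per day.
rates = {"birthdate": 5, "partial_ssn": 20, "interest_score": 0}
stored = {
    "alice": ["birthdate", "partial_ssn", "interest_score"],
    "bob": ["interest_score"],
}
print(daily_charge(rates, stored))  # 25
```

With broad scores priced at zero, keeping only Bob-style records costs nothing, which is the incentive being described.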


So your skepticism seems to be aimed not just at Facebook but at "the Internet". So you believe that Google is lying/deceptive in its privacy policy?

https://support.google.com/accounts/answer/465

> When you use Google products and services, we keep some data with your Google Account, like when and how you use certain features. We keep this data even if you delete activity or other items.

> For example, if you go to My Activity and delete a search you did on Google, we'll still know that you did a search, but not what you searched for. What you searched for will no longer be stored with your account.

> We keep this data as long as it's relevant to meet uses like those above. If you delete your account, we remove this data from it.


I've always wondered about Google Drive. If you right-click on a file, you'll notice it says "Remove" instead of the more common "Delete". Perhaps this is a Material Design spec, but I always felt like that was their way of keeping a copy of the file to add to your personal data file.


Remove and delete are distinct actions in shared file systems. “Remove” gets it out of your Google Drive. But if it’s a shared file, it’s not yours to delete, or if your company has a retention policy, you may not have the permission to delete it.


On the web UI they have a "Bin" option, which is sort of like the recycling bin on Windows.

In the Bin folder, right clicking a file gives the option "delete forever"...whether that is actually the case or not I don't know.


A more practical strategy would be not to have a FB account. If you can't beat them, leave them.


> Which means taking the effort to "friend" strangers across the world

Easy enough. More than half of the people that Facebook recommends to me as "friends" are people whose language I don't speak in countries to which I've never been.

Nice AI, Zuckerbean.


Have you considered that the AI is simply trying to greedily close the points on its knowledge graph?

Perhaps having a dense graph is simply more profitable for FB?

Sure they won't know who you know, but they can send their messages (ostensibly "on behalf of") that person you know/friended.

Sure, poison your personal data (get your Fakebook on), but going out and spamming connections may not exactly hurt FB.


> Have you considered, that the AI is simply trying to greedily close the points on it's knowledge graph?

No, all I've thought is that the AI sucks and is broken and for a company that's supposed to have the smartest people and unlimited money it should know better.

Beyond that, no.


Sucks for you <> sucks for FB. Facebook (and its pet AI) likely cares more about its revenue than it does about you.


Yup, and then some other proportion are actually people you know/are friends with. You probably occasionally reach out to the ones you know/are friends with and ignore those whose language you don't speak in the countries you've never been to.

All of this is of course a designed test to see how accurate the model is, to see how ordering of suggestions influences behavior, to see effects of noise, etc.

They aren't stupid and they aren't bad at their jobs. It's just a test.


Some years ago I tried to get Tinder to work on my phone. I needed a Facebook account for it. No problem. I had several. I created new accounts each time I logged into Facebook for some reason. Tinder still refused to work. Somebody suggested that my Facebook account needed at least 50 friends. I found groups on Facebook created specifically for the purpose of gaining friends. The next day I had 50+ friends on Facebook. This account was later locked by Facebook (too many random friends :-P ). I got Tinder to work years later, after they dropped the Facebook account requirement.


That's the whole point of doing a "simple background check" to determine if an account is real, isn't it. At least I think it is more useful for Tinder users.


Article 17 of the GDPR, The Right To Erasure, states:

Data Subjects have the right to obtain erasure from the data controller, without undue delay, if one of the following applies:

The controller doesn’t need the data anymore

The subject withdraws consent for the processing to which they previously agreed (and the controller doesn’t need to legally keep it [N.B. Many will, e.g. banks, for 7 years.])

The subject uses their right to object (Article 21) to the data processing

The controller and/or its processor is processing the data unlawfully

There is a legal requirement for the data to be erased

The data subject was a child at the time of collection (See Article 8 for more details on a child’s ability to consent)

If a controller makes the data public, then they are obligated to take reasonable steps to get other processors to erase the data, e.g. A website publishes an untrue story on an individual, and later is required to erase it, and also must request other websites erase their copy of the story.


I like this idea and wonder whether it would be possible to automate this. For it to work, it would have to make random tracking pixel requests 24/7 to prevent Facebook from filtering out the noise. Even then, the individual fake requests would have to be indistinguishable from real ones.


What you could do is create a user ring in which users who trust one another exchange identities. That way, for example, a family would blur into one huge identity for the advertisers, losing all distinctiveness.


What’s the advantage in being associated with a Facebook profile that doesn’t represent your true interests, friends, etc? I suppose it may be nice to see poorly targeted ads and thus be less likely to be convinced to spend money? Or if they’re trying to influence your mood, political views, etc., I suppose it might be preferable to receive essentially random influence rather than influence correlated with your actual attributes.


>>The best way I know of to combat Facebook is to poison their data.

some strategies for poisoning data sets are described here https://iotdarwinaward.com/post/improve-your-privacy-in-age-...


Unless I am missing your point, if you just add this to your /etc/hosts file, Facebook can't track you through any pixels:

https://github.com/jmdugan/blocklists/blob/master/corporatio...


Unbundle your real identity from your online identity. The problem was solved already with handles, until FB came along.


Was it solved?

Just because the marketers don't have your name as "John Argano", they still know everything about your handle "pishpash".

Even if it's not your name, it's not anonymous. The same profile can be targeted by political parties and manipulated in ways that we are coming to see as problematic.


> They still have tracking pixels.

Add-ons like Privacy Badger or Ghostery can block those though, right?


yes


You know I've heard this over the history of the internet. But there have been lots of things on the internet that have been lost over the years.


Here's a possibility: legislation is passed prohibiting FB (and other websites) from using data that the user has marked as deleted.


Haha, good one... and even if that happens, and they even follow it to the letter: "we don't use the data, we just have to search it for illegal activities, and therefore our AI needs to parse it... but using it? Noooo!"


alternatively, you can block Facebook's tracking (poisoning data might be somewhat easy to spot automatically)


>They still have tracking pixels

You can block that with e.g. a hosts file blocklist that blocks all Facebook related domains.
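For anyone who hasn't built one of these by hand, here is a rough Python sketch of what generating the blocklist lines looks like. The function name and the `0.0.0.0` sink address are my own choices for illustration, and the two domains are only examples, nowhere near a complete list.

```python
def hosts_entries(domains, sink="0.0.0.0"):
    """Return hosts-file lines that point each domain at a dead address.

    Duplicates are dropped and the output is sorted so the file stays tidy.
    """
    return [f"{sink} {domain}" for domain in sorted(set(domains))]


if __name__ == "__main__":
    # Example domains only; a real blocklist needs many more entries.
    for line in hosts_entries(["www.facebook.com", "facebook.com"]):
        print(line)
```

Keep in mind that a hosts file matches exact hostnames and has no wildcards, so every subdomain has to be listed separately.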


Or sign in only in an incognito mode.


I use Firefox Focus on Android when I don't care about sessions. Works an absolute treat.


Does someone have a script for this? #interested

I'm on the verge of deleting my Facebook account, but for the same reasons as given by OP I'm hesitant. Facebook is the only platform which I use to connect to some of my closest friends on the other side of the planet. Also staying up to date with events/parties. If there was an alternative for these, I would've ditched Facebook yesterday. I've wasted too many hours of my life scrolling through the feed, being subjected to all kinds of psychological manipulation. Not to mention that Facebook probably knows more about me subjectively than any other person. I'm done.


I "deleted" 11 years of history using this[0] Chrome extension. Took a few days to get through everything, and I had to re-run it a few times because it missed things.

[0] https://chrome.google.com/webstore/detail/social-book-post-m...


It doesn't cover untagging, from what I gather. Also, you can run it at 16x speed. It "deleted" ~15 years of history for me in less than an hour. And yes, it does miss a few things.


Do you really believe that it's completely deleted? My theory is that you can delete all you want, they still have it.


No, and I'm pretty sure the parent commenter doesn't either. It's why I put "deleted" in double quotes, as the above comment did.


> It doesn't cover untagging

Using the "Hide" button instead of the "Delete" button I think works on untagging.


It hides it from _your_ timeline, but the backlink to your profile still remains. You have to actually untag it.


So now you've given that extension full access to your Facebook account. And if you didn't uninstall it after using it, it's still able to see everything you do on Facebook. How is this any different from the Cambridge Analytica mess people were upset about in the first place?


Same, bit of a nightmare, but it did get there in the end.


> Facebook is the only platform which I use to connect to some of my closest friends on the other side of the planet.

I keep saying this over and over. If you can't keep in active touch with someone (phone, email, IM, text, Skype) then they're not that important to you. It doesn't even need to be very frequent - once a quarter, every six months.

If you can't manage that level of contact with someone* then there's little point hanging on to the feeling that you're still somehow connected with them. Every generation prior to ours accepted this as their reality and yet we feel we're different somehow.

*Exception: everyone has friends that they may not talk to for years but as soon as you see them or get on the phone with them you can pick up the relationship effortlessly.


Something I've realised over the years, is that you can't "passively" maintain a friendship across the other side of the planet.

You need to contact people directly in order to preserve meaningful relationships with them.

Facebook makes it too easy to keep passive relationships with friends around the world. Liking statuses and photos is not a meaningful interaction, in my opinion...

Unfortunately, this is how I no longer connect with a lot of my long distance friends.


I know right? If you never talk to these people they're essentially strangers that you know far too much about.


Summed up my thoughts exactly.


I used a Firefox add-on for this a few months ago, but I don't think it was ever updated for compatibility with the webextensions API. If I'm remembering correctly, it was this: https://addons.mozilla.org/en-US/firefox/addon/fb-purity-cle...

Worked very well for me. No reason you can't spin up an older version of FF (or 52 ESR) just to use this add-on.

I cleaned out all of my account's information first, and then slowly weaned myself off of use of the News Feed when I had free time. Eventually I wasn't even logging in once a week and just up and deleted my account.


The same extension, but for Chrome, is what I use to browse facebook.


Would be cool to make a script that just deletes everything older than 30 days. Then most of your friends wouldn't notice anything unusual about you in their news feed, but you wouldn't be building up as large a dataset for advertising and political marketing.
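The selection part of such a script is the easy bit. A minimal Python sketch, assuming you have already scraped your activity into a list of dicts with a `created` datetime (that shape and the function name are hypothetical, not anything Facebook provides):

```python
from datetime import datetime, timedelta


def older_than(items, days, now):
    """Return the items whose 'created' timestamp is before the cutoff."""
    cutoff = now - timedelta(days=days)
    return [item for item in items if item["created"] < cutoff]


if __name__ == "__main__":
    activity = [
        {"id": 1, "created": datetime(2018, 3, 20)},
        {"id": 2, "created": datetime(2018, 1, 1)},
    ]
    stale = older_than(activity, days=30, now=datetime(2018, 3, 27))
    print([item["id"] for item in stale])
```

The hard part, actually triggering the delete for each selected item, is not shown here and depends entirely on how you drive the site.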


Were I FB, I would make sure that the deletion only made the data invisible to the user and kept it for their customers. After all, we "gave them permission" in the first place so this would not violate that. If they're smart, they're already doing that.


That is surely what is happening - a general (not always, I know!) rule in programming is to never delete anything, just mark it as deleted/deprecated/etc. and do not show it.


This leaves two possibilities:

If they don't let advertisers query deleted data - my original proposal would be effective at preventing third parties from getting your data.

If they DO let advertisers query deleted data - you can 'poison' your account by posting and deleting a huge volume of misleading advertiser-relevant data.


Yeah, I'm pretty sure deleting doesn't really take the data from Facebook. Really all you can do is stop the flow of new data to them.


I'd be quite interested in a script that deletes "everything before X".

I really haven't used FB much over the last few years, but there's plenty of embarrassing high school posts/comments I'd like to remove without purging everything. Doing it by hand isn't really feasible (we're talking a few thousand items here).


please refer to this comment: https://news.ycombinator.com/item?id=16691368

I am using this extension right now after reading that comment.


I'll try that out, thanks!


> I'm on the verge of deleting my Facebook account, but for the same reasons as given by OP I'm hesitant.

I'm in the same boat. Here's a suggestion:

1) Keep your Facebook account.

2) Uninstall all their apps.

3) Replace your profile pic and cover photo with big banners saying you're no longer using Facebook and encouraging others to contact you elsewhere. The text should be big enough that it's still readable when your pic is at the smallest size.

4) Hide or delete all of your posts, photos, and likes.

5) Make a few posts over a period of time letting people know you're leaving.

6) Specifically contact your closest friends and ask them to keep you in the loop.

7) Bonus points: like random things that are uncharacteristic of you and upload and tag yourself in stock photos (e.g. from https://www.shutterstock.com/). I made all the poison data private to keep from annoying anyone.

This is what I did. I'm no longer on Facebook, but my profile is still up as a reminder to friends. If they're organizing an event, they have a reminder to message me outside of Facebook, and if they don't, I still get the notification.

Just be warned. Facebook will act like a needy ex for awhile, trying to get you back. You just have to unsubscribe from the needy notifications.


Should be titled: "How I Erased A Copy of 5000 Facebook Comments and Likes"


Or "How I gave additional data to Facebook about my erasures on top of the copy of data I planned to erase".


"How I added 5000 'erase' activities to my FB profile"


"Then deluded myself and told the story"


Up next: "How I wrote an article that explains how to do something, which actually doesn't do anything, and helps in absolutely no way other than make people feel better they did something rather than nothing"


Up next: how I wrote a comment hating and contributing nothing to the discussion.

Well, it does something: it erases your public comments on FB, and even more so, shows in a few steps how a programmer can go about doing so. Not everything has to be about what FB knows.


Agree. Facebook might still have them in some log or neural thingy about you, but it will protect you from direct profiling like the Cambridge Analytica debacle. If it turned out that deleting the activity still allows Facebook to keep it and sell it to 3rd parties, it would be a company-ending event, so there is at least self-preservation.


Why do you presume that once you come off facebook, your data is not used? That is not true, the data gets used and pimped around to the highest bidder. What is true, is that as time passes that data has less value (assuming you actually stopped pumping in the data in fb).


You just defined "raising awareness."


How did I hide my Facebook activity from myself and my friends?


Some friends and I have been talking and we think the better choice would be to write something that would flood your profile with data.

Your stuff may still be there but with that much noise it would make the data useless.


You'd be wasting your time. Your data is versioned. People smarter than you have long thought about this.


It doesn’t matter. This information has a half-life. What I “liked” diminishes in value over time. What advertisers care what I clicked on 1, 5, 10 years ago?
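To put a rough number on the half-life idea, here is a tiny Python sketch. It assumes the value of a "like" decays exponentially, and the 2-year half-life is a figure I made up purely for illustration; nobody outside Facebook knows how they actually weight old signals.

```python
def decayed_weight(age_years, half_life_years):
    """Fraction of the original value left after age_years of exponential decay."""
    return 0.5 ** (age_years / half_life_years)


if __name__ == "__main__":
    # Hypothetical half-life of 2 years for the value of a single "like".
    for age in (1, 5, 10):
        print(age, round(decayed_weight(age, half_life_years=2), 4))
```

Under that assumption a 10-year-old like keeps about 3% of its original weight, which is small but not zero.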


Imagine that at different points in time, the confidence in your persona can be quantified. Say at some point in time last year they were 89% confident of the type of person you were, but today after months of using a ton of random fake likes and clicks they are only 16% sure.

They could choose to ignore low confidence measurements and still assume you are the same person from that point in time last year. Given that they can measure the standard rate of change in people’s interests (per interest even), and how quickly interests fall out of style, they can then extrapolate how relevant certain ads might be for you today based on your old data.

Imagine that. They don’t just own your data today, they own all the forecasts of your data for years to come. Go home and sleep on it.


What if the person who gleefully shared everything on social media and the person who hides their trail actually are different? People change their beliefs and spending habits all the time. It's why we look back on our own writing from years ago and cringe.


The best solution is to just not use Facebook.

Better if you can get your account suspended for something that doesn't provoke a law enforcement response.


Agreed, if you don’t want them to have data on you the best case is to never use anything FB owned, including Instagram and WhatsApp.

If you already use FB services, stop using them and don’t give anymore data. They will still know about you, but at least you won’t give them any new information.


But do they actually do that? It seems like a lot of effort to market to someone who's less likely to be successfully marketed to.


Unless you are mentally disabled in some way, anyone can be successfully marketed. People like to believe they are special snowflakes that can rise above the influence of marketing, but hit them with the right marketing message and you got them.


> People like to believe they are special snowflakes that can rise above the influence of marketing, but hit them with the right marketing message and you got them.

Maybe so, but the success of marketing has different probabilities for different individuals. So, if you're Facebook, it wouldn't make sense to invest in de-poisoning poisoned tracking data if they'd have a low chance of success with their marketing messages. Also, I'm doubtful any de-poisoning technique is perfect, so the resulting profile is probably worse for targeted marketing, all else being equal, as well.


Right, that’s why the only way to win is not to play, there is no safe level of exposure to Facebook or Google.


Sure, for forensics, but how would FB's AI know that your new data is garbage?

Isn't that like NP-hard or something?

Furthermore, would they care? Realistically, you're padding their numbers. Things can be popular simply because they're liked (whether it's synthetic or not).


Depends on how you put in the data.

If you do it all at once it’s trivially easy. They know a person doesn’t just become a new person overnight.

If you commit to creating an elaborate fake personality by doing things your fake self would do, it can be harder, but not impossible. Certainly not a problem for some of the greatest minds in Silicon Valley.


Meh...the whole "Silicon Valley genius" trope is another myth that needs to die. Being able to code a depth first search doesn't make anyone a 'genius' and there are limits to what AI can do even with bushel baskets full of data.

For many years Facebook thought I was Jewish and was spamming me with ads to become a rabbi and move to Israel, but I am the least Jewish person you'd ever meet. And that could easily be determined by publicly searchable data about myself.


I've faked my phone number for about two weeks on Android when I installed their app to try it. 3 years after, they still ask me if my number is some random Italian number, I'm not sure they have that many checks in place.


What you're aiming for is not to remove your data. You want Facebook to think you are a different person than you are by adding false data to your account.


You will not be a different person, you will be “Person who thinks they can fool us into being a different person by feeding in fake data.”

You’ll probably get ads for people who are paranoid or security conscious.


It is a great idea, and I think it can be pulled off without being (easily) detected as decoy activity by making occasional submissions rather than something like 20 submissions a minute. A very simple way to do it, off the top of my head:

- Seed a crawler with your own profile (or any other). Make it find 50~100 public contacts with public timelines in your network. Scrape their shares along with their timestamp deltas to their corresponding previous shares. Omit the very first, since it lacks a proper timestamp delta. Put them all in a collection.

- Pick any one by random, post it all the same, after the same amount of delay as the original share against its previous share. Repeat.

You may shuffle the sentences whenever there are more than just one sentence, to obfuscate things a little.
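The bookkeeping side of this can be sketched in a few lines of Python. Everything here is an assumption for illustration: the crawling and posting are left out entirely, the input is taken to be a list of `(timestamp_seconds, text)` pairs per contact, and the function names are mine. Sentence splitting is a naive split on periods.

```python
import random


def build_pool(shares):
    """Turn one contact's (timestamp, text) shares into (delay, text) pairs.

    The earliest share is dropped because it has no previous share
    to measure a delay against.
    """
    ordered = sorted(shares)
    return [
        (ts - prev_ts, text)
        for (prev_ts, _), (ts, text) in zip(ordered, ordered[1:])
    ]


def shuffle_sentences(text, rng):
    """Reorder sentences when there is more than one (naive split on '.')."""
    parts = [s.strip() for s in text.split(".") if s.strip()]
    if len(parts) > 1:
        rng.shuffle(parts)
    return ". ".join(parts) + "."


def next_decoy(pool, rng):
    """Pick a random share and return (seconds_to_wait, text_to_post)."""
    delay, text = rng.choice(pool)
    return delay, shuffle_sentences(text, rng)


if __name__ == "__main__":
    rng = random.Random()
    pool = build_pool([(100, "First."), (160, "Nice day. Going out."), (400, "Back.")])
    print(next_decoy(pool, rng))
```

A real loop would sleep for the returned delay and then post, which is where detection risk and the TOS come in.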


What you suggest is the only thing that will succeed in preserving privacy. You cannot hide your data as they can still track you even if you do not have an account. We have to lie plausibly to contaminate the data. The important thing is that the data should be carefully generated not to be flagged as synthetic.


Ublock + NoScript + self-destructing cookies makes one much less convenient to track. Your IP and browser headers are all that's left.


You are right but the problem is this: I would like to use Google maps but I do not want Google to have a perfect profile of me so I try to add plausible false data.


I love this idea. Maybe someone could set up a service where you enter your fb password or upload your fb cookie. Then it would just destroy your profile with data.

How would it handle historical data? I don't think you can backdate messages and photos.


It'd look like automated spam, and they'd block/flag it on the spot.


What would happen to your fb account? Do they only block the account or the spam?

Also could you use ML to be more discreet to avoid the spam flag?


Not sure about ML, but there are several problems to solve before that. The service would need IP addresses that aren't from a known cloud provider, otherwise it's trivial for them to notice and block. Also it'd have to look like a real user, so a full browser (posting directly to their endpoints will get you flagged). Then, each user can't post too much and the IPs would need to be recycled as well.

Too much trouble to even start that.


You can rent a botnet of desktops from residential IP.

For example, the Hola "unblocker" C&C people have this service. https://luminati.io/


A service that continually updates your profile data (well the non-invariant type), adds/removes random "friends" and the like would be interesting.

Do it progressively enough, and when would FB know the difference?


Something like this? - https://likenoise.com/


Maybe? But with the influx of new activity, it would be reasonably easy to see where the line between real activity and fake activity falls on a timeline.


Are the objects actually deleted or just set deleted=true in the db?


I ran one of those greasemonkey scripts to (try to) delete my Facebook profile posts in mid–2016. Some months later I decided to just delete my account entirely. Many of the posts I thought I'd deleted were still in the account archive I downloaded before deleting my account (this was > 30 days after I'd thought I'd deleted them, but less than a year, so it may just be a data retention issue).


Yeah, but at least other people (third parties) are less likely to mine your data.


This would be a really good question to get a Congresscritter to ask a Facebook rep under oath. Maybe it doesn't matter and they'll lie. But I'd still like to get an answer: if a U.S. citizen deletes individual data, or the entire account, is the data retained in any sense? how does this differ for an E.U. citizen?


It still means Facebook can no longer legally use that data, at least in the EU.


If companies let laws get in the way of making money, there would be no need for enforcement bureaus.


FWIW for anyone trying to permanently ditch Facebook, adding these entries to your hosts file is also a good step.

https://github.com/jmdugan/blocklists/blob/master/corporatio...

At the very least, it helps break the habit of visiting Facebook.


Wow, thanks for this.


well I'm pretty sure all these ways to delete your facebook stuff will only soft delete it. I mean it's way more sane to do:

    UPDATE picture SET date_deleted = now() WHERE picture_id = ?
instead of actually deleting it, because that way you can restore something if people delete their stuff by accident. That means Facebook probably still has your data.
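Here is a self-contained Python sketch of that soft-delete pattern using an in-memory SQLite database. The `picture` table and its columns follow the query above but are otherwise hypothetical; this says nothing about Facebook's actual schema.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical picture table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE picture ("
        "picture_id INTEGER PRIMARY KEY, caption TEXT, date_deleted TEXT)"
    )
    conn.executemany(
        "INSERT INTO picture (picture_id, caption) VALUES (?, ?)",
        [(1, "beach"), (2, "party")],
    )
    return conn


def soft_delete(conn, picture_id):
    """Mark the row as deleted instead of removing it."""
    conn.execute(
        "UPDATE picture SET date_deleted = datetime('now') WHERE picture_id = ?",
        (picture_id,),
    )


def restore(conn, picture_id):
    """Undo a soft delete by clearing the marker."""
    conn.execute(
        "UPDATE picture SET date_deleted = NULL WHERE picture_id = ?",
        (picture_id,),
    )


def visible_pictures(conn):
    """IDs the user would see: rows not marked as deleted."""
    rows = conn.execute(
        "SELECT picture_id FROM picture "
        "WHERE date_deleted IS NULL ORDER BY picture_id"
    )
    return [row[0] for row in rows]


def stored_pictures(conn):
    """IDs that still exist in storage, deleted or not."""
    rows = conn.execute("SELECT picture_id FROM picture ORDER BY picture_id")
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = make_db()
    soft_delete(conn, 1)
    print(visible_pictures(conn), stored_pictures(conn))
```

The user-facing query filters on `date_deleted IS NULL`, while the row itself never leaves the table.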


Yep, this is how pretty much every large data warehouse works as well. It's very useful to be able to say "What was the state of everything at time X", even if it's been deleted you want to be able to see the state at certain times.

I also imagine with all the data that FB has, they probably have a legal obligation to keep data as it can be useful in solving crimes or gathering evidence. In fact their TOS explicitly states that they will preserve data: "We may access, preserve and share your information in response to a legal request"


i think that's only after they received a request.


Soft delete is the next step you learn after learning not to issue a SQL DELETE. But the step after that is more like event sourcing, where an event to delete is issued. Then some sort of snapshot creator translates that into its current dataset/view, in which the data no longer appears.

The event source can always be replayed to see the data at any point in time, including recreating stuff.
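A toy Python version of that replay, to make the point concrete. The event kinds and the `snapshot` function are invented for this sketch and use integer logical timestamps; a production event store would look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: int          # logical timestamp
    kind: str        # "posted", "edited" or "deleted"
    post_id: int
    text: str = ""


def snapshot(events, as_of):
    """Replay events up to and including as_of; return {post_id: text}."""
    state = {}
    for ev in sorted(events, key=lambda e: e.ts):
        if ev.ts > as_of:
            break
        if ev.kind in ("posted", "edited"):
            state[ev.post_id] = ev.text
        elif ev.kind == "deleted":
            # The post vanishes from the view, but its events stay in the log.
            state.pop(ev.post_id, None)
    return state


if __name__ == "__main__":
    log = [
        Event(1, "posted", 7, "hello"),
        Event(2, "edited", 7, "hello!"),
        Event(3, "deleted", 7),
    ]
    print(snapshot(log, as_of=3))  # current view
    print(snapshot(log, as_of=2))  # view just before the delete
```

The delete is just one more entry in the log, so replaying up to any earlier timestamp brings the post back.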


Do we know if Facebook stores the revision history of a given status or comment? Would it be possible to programmatically overwrite every status and comment?


> Do we know if Facebook stores the revision history of a given status or comment? Would it be possible to programmatically overwrite every status and comment?

Yes. Facebook does. There's a quora question that asks "Why does Facebook provide an edit/revision history for edits made to comments?" https://i.imgur.com/GW0x500.png https://archive.fo/EKzQx

One answer says:

> I think its simply so you can see what was their originally. People are allowed to edit their comments now and they could write something one minute, get alot of likes and totally switch it to something else. If the comment starts off with something nice like "Have a nice day" and people like it and afterwards the person who wrote the comment comes back and changes it to something negative, then that would make you look bad. So it is there to make sure everyone can see what was originally there.

Another answer says

> Yes. It allows the audience to keep the content editor accountable for their actions. The transparency in information sets clear expectations for both the content producer and consumer.


We need to solve the problem of companies gobbling up consumer data by means of "connectors". Remember back in the day when social was just a place to write on walls. Now people use it as their identity/email on the internet. It's like Walmart in that regard, the giant mega corp is just all too convenient to resist. We need more innovating ma' and pop shops on the internet if anyone other than ourselves will be stewards of our data in the future.


Ha... I did exactly the same. Automated mbasic to delete/hide from timeline/remove reaction to everything older than the last 30 days.

Also deleted all albums and photos that I could. (leaving a few cover photos and a single profile picture)

I know there'll be backups and it's probably just a soft delete. But it was still an enjoyable process.

By default, facebook should allow you to keep a tailing delete of everything older than XX days if wanted.


Hey Jaruzel,

Do you think you will share the code on github or something similar? Great writeup.


It's just some C# loops, and scanning the DOM for 'a' tags, and then analysing the 'href' attribute on them, and if there's a match, put it in the array. What I wanted to share was the methodology I used to solve the problem.
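For anyone who wants to try the same approach, here is a rough Python equivalent of the "scan the 'a' tags and match on href" step, using only the standard library. This is my own sketch, not the C# from the article, and the `delete` substring and sample URLs are placeholders for whatever pattern the real links use.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags that contain a given substring."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and self.needle in href:
            self.matches.append(href)


def collect_links(html, needle):
    """Return all matching hrefs in document order."""
    parser = LinkCollector(needle)
    parser.feed(html)
    return parser.matches


if __name__ == "__main__":
    page = '<a href="/delete.php?id=1">x</a><a href="/profile">y</a>'
    print(collect_links(page, "delete"))
```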


Seconding this! I wonder if it could be done with some straightforward greasemonkey scripting or something.


Jaruzel - in using Facebook since this purge, what have you noticed? Less relevant ads? Bad friend suggestions? Other?


I've literally just done it (yesterday), so I've not seen anything different yet.


Ironically, this kind of automated activity is a violation of the TOS and may get your account suspended.


It is unlikely facebook deleted anything. Just all the stuff you did is marked as unavailable to the public. It would be kept because it is still profitable to sell the information about your private actions to other corporations.


I asked this a while ago, but when you delete something on FB, the data's probably not completely deleted.

There's likely data retained in backups (unless they have some way of pruning your data from backups), and it's likely that your data already sits on an external platform OUTSIDE of Facebook.

Unless Facebook requires apps that you connect to to delete your information gathered from Facebook when you delete your FB account or when you disconnect an app, you're likely to have your personal data and likes and comments already somewhere else.


Probably not, but how much does that really matter? The main point of these erasures, IMO, is so 'analytics' providers can't sift through it... and I doubt FB makes their backups available to such providers.


I'm not a big fan of Facebook. It's a huge black hole sucking up your time. However, I recently had great success crowd-sourcing a new software feature from a closed FB group. The people were great: informed, smart and helpful. Over a couple of weeks, I was able to release a cutting-edge chart that innovates genetic genealogy.

Just never publish anything you wouldn't be comfortable sharing with the World.


Is this even necessary in the age of GDPR? I mean, with GDPR Facebook cannot just tombstone your profile. It has to delete every copy of any data that can uniquely identify you, which includes "your" likes and of course comments.


Jaruzel wanted to keep their account, but remove their visible activity.


>Is this even necessary in the age of GDPR?

Yes.


So many people here say it would be pointless to poison the profile data with noise. This is giving up before trying. In the end it's a fight of our algorithms against Facebook's. I'm quite confident who wins.


Emphasis should be on the cessation of commenting, clicking like, or performing any other Facebook activity, or this process would seem to be in vain.

(for sure, the process is only a "soft-delete")


If it is a 'soft-delete' as you and others point out, then at least I know I've done as much as I can without resorting to deleting my account.

And yes, I intend to use FB in 'read-only' mode from now on.


> And yes, I intend to use FB in 'read-only' mode from now on.

If you visit a profile, view a video, open a group or message it's all recorded. I'm betting that they can even tell the probability you read an ad just by seeing if you stop scrolling on it and then start scrolling again.


Even if you delete your account, that information is still probably kept (assuming that you're in the US).


EU/UK here. Hoping that GDPR will force FB to actually delete every copy of what I've tried to delete/hide.


Good luck that they don't bork the GDPR replacement bill they're writing for post-Brexit.


Some actions are completely out of your control, though. Other people can still tag you (on pictures, locations, etc.), even if you hide it from your timeline. If there's an option to prevent people from tagging you, I haven't found it.


There is an option in the settings for that, you need to manually approve tags after that.


I found some websites that do the detective work to find your online traces, but I don't know if they are that useful. I believe that once information has gone online, it's backed up somewhere even if the website deletes your data. For me, that's big data, lol. I learned a new tip today. Thanks.


"How I set the hide flag on all my comments and likes while doing nothing about actually deleting them."


Do you really need to crawl the HTML? There are APIs to do that already, right?


Does the "right to be forgotten" extend to Facebook?

If I can write to Google to not appear in Google search results, would it be possible to write to Facebook and ask not to appear in their results when people search my name?


The data is already gone. It's on a developer's server, a developer's laptop, or has been laundered and is back in Facebook. Even Facebook has no way of tracking what has been done with the data.


Most harmful is to delete the most recent material. Say everything after 2016. Nothing makes a "social" site look more dead than abandoned accounts.


I wondered how you got that number and you don't explain how to (I don't want to get rid of that piece of memorabilia)


I just had a counter that tracked the DeleteList().Length


That's very clever. I tried to do something along those lines. On second thought, why bother, it's just a flag for them.


When you delete something, I think it's not even setting the deleted column to true, just a display:none /s


You could just not give any 3rd party app permissions to see your stuff and achieve the same thing?


I'm pretty sure you also erased all the 2000000 backups that fb makes every minute


Hate to break it to you but you essentially just hid all that shit from the UI.


This is excellent and far more efficient than other tools I've tried.


Where's the code?


It didn't get published


Then publish it you


Funny. I just did the same thing last week.


"How I flipped a couple booleans to `is_deleted=true` in Facebook's DB"


"Never delete data. Storage is cheap. Just mark it as deleted."


Deleting anything given the limitations of Facebook, is very irresponsible. I've invested hours of commenting on stuff. The way Facebook is designed if people delete their comment or posts on which I commented, my content gets wiped out, too.

Other systems have handled this gracefully in the past. I hate when somebody decided to wipe all evidence out and put hours of life into the trashcan.

If you want Facebook to be responsible, set an example, and be responsible and respectful of other people's time and effort!


Is this a joke? Comments on Facebook are not "effort". They are a waste of life.


Yours - sure, mine - mostly not.


I think you are taking Facebook far too seriously.


Seriously, but not too seriously, sorry. And, yeah, that's what we grownups do.


Oh sorry, let me rephrase: you kids are taking Facebook far too seriously



