We don't sell anything directly on our websites, but I've been fighting a massive botnet of click fraudsters who come to the site off an ad and then visit a bunch of random other pages and submit any forms they encounter with gibberish.
It's quite annoying. In addition to paying for bogus clicks, it throws off all the analytics. And there are thousands of IPs involved (mostly in Asia).
I have an idea for you (aside from banning all of those annoying IPs). Try mapping the paths the random clicks take. If you see a pattern, then once a bot starts the pattern, have a form pop up asking whether it is human. If there is no pattern, you can still trigger a random pop-up form asking whether the user is a bot. I think you will probably stop some of the bots.
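To make the "mapping" idea a bit more concrete, here is a rough Python sketch. It assumes you can already group page views into per-visitor sessions (the `sessions` list and the `common_paths` name are made up for illustration) and simply counts how often each run of three consecutive pages shows up.

```python
from collections import Counter

def common_paths(sessions, length=3):
    """Count every run of `length` consecutive pages across all sessions.

    `sessions` is assumed to be a list of page lists, one per visitor,
    in the order the pages were requested.
    """
    counts = Counter()
    for pages in sessions:
        for i in range(len(pages) - length + 1):
            counts[tuple(pages[i:i + length])] += 1
    return counts

# Made-up sessions, purely for illustration.
sessions = [
    ["/landing", "/about", "/contact", "/jobs"],
    ["/landing", "/about", "/contact"],
    ["/pricing", "/faq"],
]
paths = common_paths(sessions)
```

A path that turns up far more often than normal browsing would explain is a candidate trigger for the "are you human" form. Where that cutoff sits is a judgment call I can't make for you.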
Fine. I'll suggest banning IPs. If you find a form filled with random characters, ban the bot. And then maybe there is software that can tell you whether certain text is a sentence. If that exists, run that software on the posts, and if a large percentage are non-sentences, you can either ban that IP or pop up a form asking for human confirmation! If the gibberish is random sentences, you can try searching for common words. If you find none, then you can automatically have a form pop up. Gosh. Lots of work =/
I have lots of ideas; however, lots of them won't be easy to implement for various reasons. Well, if you want more ideas, feel free to ask me!
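The "search for common words" check is simple enough to sketch in Python. Everything here is an assumption on my part: the tiny word list, the 0.1 cutoff, and the function names are placeholders, not a tested detector.

```python
import re

# Hypothetical, deliberately tiny list of very common English words.
COMMON_WORDS = {
    "the", "a", "an", "and", "to", "of", "in", "is", "i", "you", "it",
    "for", "my", "we", "on", "with", "this", "that", "have", "please",
    "me", "are", "be", "can", "your",
}

def common_word_ratio(text):
    """Return the fraction of alphabetic tokens that are common words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in COMMON_WORDS)
    return hits / len(tokens)

def looks_like_gibberish(text, threshold=0.1):
    """Flag text whose common-word ratio is below the (assumed) threshold."""
    return common_word_ratio(text) < threshold
```

This only makes sense on free-text fields such as a comment box. Names, phone numbers, and empty fields will all score zero, so they should be skipped rather than treated as proof of a bot.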
Man. I've always wanted to design anti-bad-things software. Makes me feel superior?
I've managed to figure out a few peculiar things the bots do while crawling combined with the fact that they seem to only use one of a handful of legit, but very specific user agents. I've got some mod_security rules + a script that combs logs and passes the IP to an iptables script to block them. Are you familiar with mod_security? The default rules are a little too aggressive for my liking, but it's a really fantastic tool.
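For anyone curious, the log-combing half could look roughly like the Python sketch below. It is not my actual script: it assumes the "combined" access log format, the user agents in `SUSPECT_AGENTS` are placeholders rather than the ones the bots really use, and the `min_hits` cutoff is arbitrary. It only matches on user agent and leaves out both the crawl-behaviour checks and the hand-off to iptables.

```python
import re
from collections import Counter

# "Combined" access log format (an assumption; adjust to the real format).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Placeholder values, NOT the user agents the bots actually send.
SUSPECT_AGENTS = {"ExampleBrowser/1.0", "ExampleBrowser/2.0"}

def suspect_ips(lines, min_hits=5):
    """Return sorted IPs that sent at least `min_hits` requests with a suspect agent."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("agent") in SUSPECT_AGENTS:
            hits[match.group("ip")] += 1
    return sorted(ip for ip, count in hits.items() if count >= min_hits)
```

Matching on user agent alone would catch real visitors too, since the agents are legitimate ones, which is why the real thing also looks at crawl behaviour before blocking anyone.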
Adding a CAPTCHA to all forms would be an easy solution, but it's just not practical on many of our forms. I've had a good deal of success adding a hidden form field (that is, one that's set to display:none by a CSS rule) and then ensuring that it hasn't been tampered with when the form is submitted.
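The server-side half of the hidden-field check is tiny. Here is a minimal Python sketch, assuming the hidden field is a text input that starts out empty and that submitted form data arrives as a dict. The field name `website_url` and the function name are invented for the example.

```python
# Hypothetical name of the field that a CSS rule sets to display:none.
HONEYPOT_FIELD = "website_url"

def is_probable_bot(form_data):
    """Return True if the hidden field is missing or no longer empty.

    Assumes a real browser submits the hidden input with an empty value,
    so a missing or filled-in field means the form was tampered with.
    """
    value = form_data.get(HONEYPOT_FIELD)
    return value is None or value.strip() != ""
```

A submission like `{"name": "Ann", "website_url": ""}` passes, while one where the bot has typed something into `website_url` gets flagged.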
The gibberish is typically random letters and numbers, but it's smart enough to fill all numbers in fields expecting phone numbers and email addresses in fields expecting emails.
The scale of the operation is daunting, though. My script is pretty conservative (I really don't want to block legit users) and it still picks up a few hundred new IP addresses every day. I expire the bans after a week or the list would get unmanageable.
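The one-week expiry is just bookkeeping. A simplified Python sketch of the idea, with a made-up `BanList` class that keeps timestamps in memory (actually removing the firewall rule for each expired IP is left out):

```python
import time

BAN_SECONDS = 7 * 24 * 60 * 60  # one week

class BanList:
    """Hypothetical in-memory ban list keyed by IP address."""

    def __init__(self):
        self._banned_at = {}

    def ban(self, ip, now=None):
        """Record (or refresh) a ban; `now` is injectable for testing."""
        self._banned_at[ip] = time.time() if now is None else now

    def is_banned(self, ip):
        return ip in self._banned_at

    def expire(self, now=None):
        """Drop bans that are at least a week old and return the freed IPs."""
        now = time.time() if now is None else now
        expired = [ip for ip, banned_at in self._banned_at.items()
                   if now - banned_at >= BAN_SECONDS]
        for ip in expired:
            del self._banned_at[ip]
        return expired
```

Calling `expire()` from a daily job keeps the list bounded at roughly a week's worth of new IPs.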
And dealing with Google is very frustrating. Google says they've already detected all fraudulent clicks and credited us. I think they're wrong, but I don't see how I could possibly prove it.
Sadly, I'm not familiar with a lot of security tools. I've heard about that form trick and I'm kicking myself for forgetting about it and not mentioning it; however, you already knew so no harm there =p
Yeah. Captcha is an idea that I would have mentioned, but my ideas for captchas are insane. When I start talking about all of the little things I want to do I start sounding ridiculous. But I definitely suggest putting a basic one on all of your forms. It's quite simple to make actually. PHP provides all of the necessary "image creation" functions you would need.
Indeed. However, a smart bot will still not fill in a form box with display off. My solution is to provide a blank form box and literally tell the user "If you fill out this box, then the submission of this form will be ignored" or something of the like. Of course this won't stop bots which have been fine-tuned to "attack" your site. Okay. I'll stop ranting. I just love talking about the subject.
The scale of the operation really is daunting. Sadly without setting up subtle things all over the place and then scrutinizing the data, it becomes hard to do anything at all. BTW that is a LOT of IP addresses. How many of those are actually confirmed bots?
As for proving Google wrong, it would be very difficult. I'd love to help if I could haha. I'll throw out a couple of ideas that come to mind seeing as I'm talking about one of my favorite topics and well...you can't possibly stop me!
You can detect the site they just came from. If you can confirm the bot
(Yeah. Hard part, but someday if someone doesn't do it first- I'll make the software necessary myself -and of course make it free. Open sourcing it would be a tricky issue because then the enemy can read it! I don't like making a smarter enemy.)
and put that together with the site, you can investigate where the clicks are coming from. If they are coming from several sites with Google AdSense or whatever, well. That can turn out to be pretty compelling evidence. (odds are) Furthermore, a lot of those IP addresses will be from Asia. Pretty compelling evidence. K. That was racist, but also a statistically proven fact =/. When you stack up similarities, like maybe even browsing time on your site, you may be surprised at the quality of the argument you can make. I mean, I am assuming the attackers aren't like me. If I were an attacker I would spend weeks refining the software so that I wasn't bested by someone like me. But hey. Nobody is perfect. Find similarities. Get your money back! RAWR
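Here is roughly what I mean by putting the bot together with the site, as a Python sketch. It assumes you already have a set of confirmed bot IPs (the hard part) and a list of (IP, referrer URL) pairs pulled from your logs. Both inputs and the function name are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit

def referrer_hosts(hits, confirmed_bot_ips):
    """Count referring hostnames, looking only at confirmed bot IPs.

    `hits` is assumed to be an iterable of (ip, referrer_url) pairs.
    Missing or unparseable referrers are grouped under "(none)".
    """
    counts = Counter()
    for ip, referrer in hits:
        if ip in confirmed_bot_ips:
            host = urlsplit(referrer).hostname or "(none)"
            counts[host] += 1
    return counts
```

If a handful of hosts account for most of the confirmed bot traffic, that tally is the kind of similarity you can stack up with the rest.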
This doesn't read so much like click fraud to me as hijacking a user's browser to convert organic traffic into affiliate credit. The fraudulent clicks are merely a cover story.
There's got to be something more here. Redirects through seven partners - most likely paying on net 30 terms and taking a cut of the revenue themselves for brokering - would reduce the money torrent to a long-delayed trickle. Perhaps the intermediary partners are placing cookies for targeting purposes and therefore adding a bit to the revenue stream?
I'm also surprised that clicks are being generated from Google, of all places. Faking the click from an affiliate of a comparison shopping site would be much more lucrative (higher PPC due to higher purchase intent) and much less likely to result in aggressive countermeasures (comparison shopping sites aren't Google.)
I worked for a site where we were considering going through an intermediary for AdSense because we could earn a greater percentage of Google's take per click than dealing with Google directly! It turns out that major partners can cut such a good deal that they can pass on a better percentage even after taking their own cut. Perhaps Google should consider reducing intermediaries by paying out more fairly, even if doing so increases administrative costs.
Cookie stuffing is an old trick. This seems like a way of defrauding AdSense uniquely by juicing the conversion rate, which is easily doable if one views the conversion code and puts some thought into this sort of method. But it still seems easy enough for Google to catch over time as they track-back the AdSense accounts.