Building Smarter Forms to Block Spam
I personally hate captchas.
Sure, they are a great way to stop spam, but they are an even better way to stop real users from using your site.
And once you understand how they work, they aren’t even that great. Most pair a word the system already knows with a word it doesn’t. If you want to have some fun: only the known word, usually the easier of the two to read, is actually validated. The system has no idea what the other word is, so you can type anything you want for it. That doesn’t help reCAPTCHA’s goal of automating the conversion of printed text to digital, but it’s the only thing that gets me through multiple tries at impossible-to-decode captchas.
Alternatives
Hating captchas doesn’t mean you are stuck with spam. Most bots are terribly written, and after collecting some samples you can create some pretty good rules to filter their submissions out. Feel free to implement some or all of these:
The a href Tag
While a human may put in a URL to a site, no human with good intentions will put in a URL wrapped in an <a href=""> tag. Just do a quick search for “<a href=” and if you find it, trash the form.
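A minimal sketch of that search in Python, assuming the submitted form arrives as a dict of field names to strings. The helper name `contains_anchor_tag` is made up for illustration, and the regular expression is a slightly looser version of a plain substring search so that `<A HREF=` and extra whitespace are caught too.

```python
import re

# Matches an opening anchor tag that carries an href attribute,
# ignoring case and tolerating extra whitespace.
ANCHOR_TAG = re.compile(r"<\s*a\s+[^>]*href\s*=", re.IGNORECASE)


def contains_anchor_tag(fields):
    """Return True if any submitted value contains an <a href=...> tag.

    `fields` is assumed to be a dict mapping field names to string values.
    """
    return any(ANCHOR_TAG.search(value) for value in fields.values())
```

A bare URL such as `http://example.com` passes this check; only the HTML-wrapped form is rejected.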
Repeating Entries
If the name, company, address and city are all the same thing, it’s probably spam.
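One way this rule could look in Python. The field names below are assumptions about your form, and comparing case-insensitively after trimming whitespace is my own choice rather than part of the rule.

```python
def has_repeated_entries(fields, names=("name", "company", "address", "city")):
    """Return True if every listed field is filled in with the same value.

    `fields` is assumed to be a dict of submitted values; `names` is a
    hypothetical list of the fields to compare.
    """
    values = [fields.get(n, "").strip().lower() for n in names]
    filled = [v for v in values if v]
    # All fields must be present and collapse to a single distinct value.
    return len(filled) == len(names) and len(set(filled)) == 1
```

Blank fields are deliberately not treated as "the same", so an otherwise empty form is left to your normal required-field validation.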
Submission Time
Put a form field on the page that holds the timestamp from when the page was built (epoch time is easy), then compare it against the current timestamp when the form is submitted. Anything that comes back in under 3-5 seconds is probably spam.
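A rough sketch of the timing check in Python. The function names and the `now` parameter (there only to make the code easy to test) are illustrative, and treating a missing or unparseable timestamp as spam is an assumption on my part.

```python
import time

MIN_SECONDS = 3  # low end of the 3-5 second range


def render_timestamp(now=None):
    """Value to place in the form field: epoch time in whole seconds."""
    return str(int(time.time() if now is None else now))


def submitted_too_fast(rendered_at, now=None, min_seconds=MIN_SECONDS):
    """Return True if the form came back faster than min_seconds.

    A missing or garbled timestamp is also treated as spam (an assumption).
    """
    try:
        rendered = int(rendered_at)
    except (TypeError, ValueError):
        return True
    current = time.time() if now is None else now
    return current - rendered < min_seconds
```

When building the page you would write `render_timestamp()` into the field, and on submit pass the field's value to `submitted_too_fast()`.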
Honeypots
Place two form fields, one prefilled with text and the other empty. Wrap the fields in a div and then hide the div with CSS. Real users never see the fields, so they come back untouched. When you check the form, if the fields don’t match what you built the page with, trash it.
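Here is one possible shape for the honeypot in Python. The field names, the prefilled value and the inline `display:none` style are all placeholders for illustration; in practice you would pick names that look tempting to a bot and hide the div from your stylesheet.

```python
import html

# Hypothetical field names and prefilled value.
PREFILLED_FIELD = "website"
PREFILLED_VALUE = "do-not-change"
EMPTY_FIELD = "nickname"


def render_honeypot():
    """Return the HTML for the two trap fields inside a hidden div."""
    value = html.escape(PREFILLED_VALUE, quote=True)
    return (
        '<div class="hp" style="display:none">'
        f'<input type="text" name="{PREFILLED_FIELD}" value="{value}">'
        f'<input type="text" name="{EMPTY_FIELD}" value="">'
        "</div>"
    )


def honeypot_tripped(fields):
    """Return True if either trap field differs from what was rendered."""
    return (
        fields.get(PREFILLED_FIELD) != PREFILLED_VALUE
        or fields.get(EMPTY_FIELD, "") != ""
    )
```

A bot that fills in every field changes both values, and one that strips unknown fields drops the prefilled one, so either behavior trips the check.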
Zip Codes
If the zip code is 123456, then it’s probably spam.
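As a sketch, this can be a simple lookup against a set of known junk values. Only 123456 comes from the rule above; the set is there so you can add other values you see in your own spam samples.

```python
# Known junk zip codes; extend with values from your own spam samples.
PLACEHOLDER_ZIPS = {"123456"}


def is_placeholder_zip(zip_code):
    """Return True if the submitted zip code is a known junk value."""
    return zip_code.strip() in PLACEHOLDER_ZIPS
```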
These Don’t Work
Referrers
Unfortunately, many security suites (anti-virus plus internet security) now strip the referrer from requests. Rejecting form submissions that arrive without a referrer will cut out a lot of real users too.
Missing Address Info
Often we forget that websites are international and some fields just don’t make sense in some locations. Many countries don’t have states, zip codes or 10-digit phone numbers. Be careful you aren’t cutting out users that you actually want.