Site spam sucks, no doubt about it. I was getting tired of fighting it by hand and decided to get some help. I'd heard about different techniques for thwarting comment spam and other kinds of form spam, but none of them seemed to make a big enough dent in the problem for my liking. That is, until I tried Bad Behavior.
Of course, no single solution will be the be-all and end-all spam killer that I so eagerly desire, but Bad Behavior comes closer than anything I've seen. I was able to install BB in about an hour and was pleasantly surprised at how easy they had made the process.
Overview
Their website explains it best:
Bad Behavior complements other link spam solutions by acting as a gatekeeper, preventing spammers from ever delivering their junk, and in many cases, from ever reading your site in the first place. This keeps your site’s load down, makes your site logs cleaner, and can help prevent denial of service conditions caused by spammers.
The How Bad Behavior Works page contains this little gem:
It’s black magic.
Let's take a step back and examine the spammer's weapon of choice. Most of the site spam seen around the web comes from automated crawlers that scan sites for forms and submit bogus URLs, hoping they will be posted automatically. Those links are then picked up by Google, increasing the spammer's PageRank or some such. So for the rest of the article, I've chosen these automated bots as my enemy. There are of course other types of spam, but I'll leave those for another article.
Many of these attacks come in the form of cURL requests fired off by a script, probably on a computer infected by a virus. They disguise themselves by sending a common User-Agent string, such as that of a popular web browser. Many sites will refuse a request that has no User-Agent at all, but once one is supplied they gladly start serving up information. This is where Bad Behavior comes in. Bad Behavior reads the User-Agent and checks the other browser-specific headers that SHOULD be set along with it.
For example, if BB sees a User-Agent claiming to be Internet Explorer, it also checks that the 'Accept' header is set. If this test fails, the code kills the page before anything is returned to the bot. The hope is that this tricks the bot into believing your site does not exist, so it never tries to come back (fingers crossed).
Realize that none of this is concrete or foolproof but, as I keep saying, it catches a huge number of unwanted visitors.
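To make the idea concrete, here is a minimal sketch of that kind of consistency check. It is not Bad Behavior's actual code: the function name looks_like_spoofed_ie() is made up for illustration, and matching on the MSIE token is an assumption that only covers older Internet Explorer versions.

```php
<?php
// Hypothetical helper, not part of Bad Behavior: flags a request whose
// User-Agent claims to be Internet Explorer but which sent no Accept header.
function looks_like_spoofed_ie($user_agent, $accept) {
    // Older Internet Explorer versions identify themselves with the token "MSIE".
    $claims_ie = stripos((string) $user_agent, 'MSIE') !== false;
    // A real browser sends a non-empty Accept header with every request.
    $has_accept = trim((string) $accept) !== '';
    return $claims_ie && !$has_accept;
}

// Possible usage at the top of a page (left commented out in this sketch):
// $ua     = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
// $accept = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '';
// if (looks_like_spoofed_ie($ua, $accept)) {
//     header('HTTP/1.1 403 Forbidden');
//     exit;
// }
```

BB runs many more tests than this one, but they follow the same pattern: compare what the client claims to be with how it actually behaves.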
Installation
Enough of this under-the-hood stuff. Let's get down to how to install it.
Download
Download Bad Behavior from their site:
http://www.bad-behavior.ioerror.us/download/
Extract
Extract the contents of the zip-archive anywhere on your web-server that is accessible.
Configuration
Open bad-behavior-generic.php in the root of Bad-Behavior. This should be the only file you need to open during the installation/configuration process.
This file is broken down into functions, each requiring a little bit of altering to enable the script. They have organized this file extremely well and left well-worded comments to help you along. Some knowledge of PHP is required for this step but nothing beyond the basics.
The first block of code is the settings array:
$bb2_settings_defaults = array(
    'log_table' => 'bad_behavior',
    'display_stats' => true,
    'strict' => false,
    'verbose' => false,
    'logging' => true
);
I left the default values here. The good news is that BB fails extremely well. So, for example, even if you have logging turned on, it will fall back silently if it cannot talk to a database.
// Return current time in the format preferred by your database.
function bb2_db_date() {
    return gmdate('Y-m-d H:i:s'); // Example is MySQL format
}
This is the function BB uses to format the current date and time (in GMT) the way your database expects.
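If you are unsure what that format string produces, a quick check with a fixed timestamp shows the shape of the value. This is only an illustration; the variable name $formatted is mine.

```php
<?php
// Format a fixed Unix timestamp (0 = the epoch) so the result is predictable.
$formatted = gmdate('Y-m-d H:i:s', 0);
echo $formatted, "\n"; // prints 1970-01-01 00:00:00
```

That layout matches what a MySQL DATETIME column accepts. If your database wants something different, this is the only place you need to change it.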
<?php
// Return affected rows from most recent query.
function bb2_db_affected_rows() {
    return mysql_affected_rows();
}
// Escape a string for database usage.
function bb2_db_escape($string) {
    // mysql_real_escape_string() respects the connection charset but needs an open connection:
    // return mysql_real_escape_string($string);
    return mysql_escape_string($string); // Works without a connection, but ignores its charset.
}
// Return the number of rows in a particular query.
function bb2_db_num_rows($result) {
    if ($result !== FALSE)
        return count($result);
    return 0;
}
// Run a query and return the results, if any.
// Should return FALSE if an error occurred.
// Bad Behavior will use the return value here in other callbacks.
function bb2_db_query($query) {
    $db = new db; // My own DB abstraction class; swap in whatever you use.
    $db->query($query);
    if ($db->results)
        return $db->results;
    return false;
}
// Return all rows in a particular query.
// Should contain an array of all rows generated by calling mysql_fetch_assoc()
// or equivalent and appending the result of each call to an array.
function bb2_db_rows($result) {
    $results = array();
    while ($row = mysql_fetch_assoc($result)) {
        $results[] = $row;
    }
    return $results;
}
?>
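This chunk of code configures all the functions BB uses to interface with the database. I use a basic DB abstraction class, but many of these lines could easily be swapped out. For example, $db->query($query) could be replaced with $result = mysql_query($query), and so on. Note that the mysql_* functions were removed in PHP 7, so on a current PHP version you would use the mysqli or PDO equivalents instead.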
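If you don't have an abstraction class of your own, the same callbacks can be written directly against mysqli. What follows is only a sketch under a few assumptions: the bb2_mysqli() helper is hypothetical (it is not part of Bad Behavior), the connection details are placeholders, and I have not run it against every BB release.

```php
<?php
// Hypothetical helper: opens one shared mysqli connection on first use.
// Host, user, password, and database name are placeholders.
function bb2_mysqli() {
    static $link = null;
    if ($link === null) {
        mysqli_report(MYSQLI_REPORT_OFF); // report failures as false, not exceptions
        $link = @mysqli_connect('localhost', 'db_user', 'db_password', 'db_name');
    }
    return $link; // false if the connection failed
}

// Return affected rows from most recent query.
function bb2_db_affected_rows() {
    $link = bb2_mysqli();
    return $link ? mysqli_affected_rows($link) : 0;
}

// Escape a string for database usage.
function bb2_db_escape($string) {
    $link = bb2_mysqli();
    // Without a connection no query will run, so a generic fallback is enough.
    return $link ? mysqli_real_escape_string($link, $string) : addslashes($string);
}

// Return the number of rows in a particular query.
function bb2_db_num_rows($result) {
    return ($result instanceof mysqli_result) ? mysqli_num_rows($result) : 0;
}

// Run a query: mysqli_result for SELECT, true for other successful
// statements, false on failure.
function bb2_db_query($query) {
    $link = bb2_mysqli();
    return $link ? mysqli_query($link, $query) : false;
}

// Return all rows in a particular query as an array of associative arrays.
function bb2_db_rows($result) {
    $rows = array();
    if ($result instanceof mysqli_result) {
        while ($row = mysqli_fetch_assoc($result)) {
            $rows[] = $row;
        }
    }
    return $rows;
}
```

Every callback checks for a missing connection first, so a database outage should degrade to "no logging" rather than a fatal error.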
<?php
// Return emergency contact email address.
function bb2_email() {
    return "example@example.com"; // You need to change this.
}
?>
Just enter your email address here.
Everything else I left as default. All you have to do now is include the generic config file somewhere at the top of your site before any output and it will take care of the rest.
<?php
require('Bad-Behavior/bad-behavior-generic.php');
?>
When it first runs, it will try to create its logging table. BB uses logging to help gauge whether something ought to be allowed into your site. It is also advisable to call the following function in the head of every page with a form on it. It outputs some JavaScript that alters the form in an effort to trick spam bots.
<?php
bb2_insert_head();
?>
And that's it! Give it a try and see what happens. You may not know it's even running, but I noticed an immediate and drastic reduction in spam messages left on my own sites. I recommend this to everyone.
Having Trouble?
The installation guide on BB's site for generic web apps is pretty thin. Visit their installation page for more information. The thing that helped me the most was opening the bad-behavior-wordpress.php file included with the download. This is the default configuration for WordPress installations. When I had a hard time understanding exactly what each function was expected to return, I referred to the WordPress version and found my answer pretty quickly.

