Getting started with CAPTCHA

What is CAPTCHA?

CAPTCHA stands for “Completely Automated Public Turing test to tell Computers and Humans Apart”. It is a way to try to ensure that the visitor submitting a form on your website is actually a person.

Why use CAPTCHA?

The most common use of CAPTCHA on the web today is to prevent forms from being submitted automatically by programs, typically called bots, that are designed to submit a form repeatedly, usually for the purpose of spam. By adding a CAPTCHA to your form, you can cut down on the amount of spam you receive via a contact form, or prevent bots from signing up for accounts on your website.

Another form of spam which can be reduced through the use of a CAPTCHA is comment spam. Spammers post comments containing links back to their site in order to increase the number of inbound links and make the site rank higher in search engines.

Types of CAPTCHA

By far the most common type of CAPTCHA uses randomly arranged letters that are distorted in some way and drawn over various background colours. These are the ones that you will most likely have seen when signing up for a web mail account.

An example of the use of these can be seen when you are signing up for a Hotmail or Yahoo! web mail account.

An example of a CAPTCHA used in the Yahoo Mail signup

An example of a CAPTCHA used in the Hotmail Mail signup

Another example is the contact form on our website http://www.interspire.com/contact/.

An example of a CAPTCHA used on our contact form

Other alternatives do exist. Audio CAPTCHAs for the visually impaired are probably the second most common type. There are also CAPTCHAs that ask you to solve a problem that should be easy for a person but very hard for a computer, such as choosing which item in a list is not a bird. The problem with this approach is that you need a large number of questions before it really becomes effective.

Checking CAPTCHA requirements

The most common requirement for generating a visual CAPTCHA is GD, a graphics library that PHP uses for generating images. To check if GD is installed on your PHP web server, create a phpinfo page by creating a new file on your server with the following PHP code:

<?php phpinfo(); ?>

Next, browse to this PHP file in your web browser. There should be a section called GD. If there is no GD section then GD is not enabled on your server and you won’t be able to use most CAPTCHA systems.

Example GD section from a phpinfo page

Also in this section you can see if FreeType support is enabled. If it is, you will be able to use TrueType fonts in your CAPTCHAs, which gives you a much bigger range of fonts to generate them with and so reduces the effectiveness of scripts that attempt to automatically recognize letter and number shapes.
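The same checks can also be made from code rather than by reading the phpinfo page. The following is a minimal sketch: captchaRequirements() is a hypothetical helper name, while extension_loaded() and gd_info() are standard PHP functions. It assumes the 'PNG Support' and 'FreeType Support' keys that gd_info() reports.

```php
<?php
/**
 * Report which CAPTCHA-related GD features this PHP build offers.
 * captchaRequirements() is a hypothetical helper name, not part of GD.
 *
 * @return array Feature name => true/false
 */
function captchaRequirements()
{
    $report = array(
        'gd'       => extension_loaded('gd'),
        'png'      => false,
        'freetype' => false,
    );

    // gd_info() only exists when the GD extension is loaded
    if ($report['gd'] && function_exists('gd_info')) {
        $info = gd_info();
        $report['png']      = !empty($info['PNG Support']);
        $report['freetype'] = !empty($info['FreeType Support']);
    }

    return $report;
}

// Print one line per feature, for example "gd: yes"
foreach (captchaRequirements() as $feature => $available) {
    echo $feature, ': ', ($available ? 'yes' : 'no'), "\n";
}
```

If 'freetype' comes back as no, the imagettftext function used later in this article will most likely not be available either.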

If you’re targeting your site towards people with disabilities, specifically visually impaired users, then using a visual CAPTCHA is a bad idea since they will have trouble entering the characters in the image.

Also, there are various ways to defeat CAPTCHAs, so they should be considered only one line of defence against spam. Other techniques include blocking website visitors by IP address or by the content of a form submission.
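As a rough illustration of those other techniques, the sketch below rejects a submission by IP address or by content before the CAPTCHA is even checked. The function name isBlockedSubmission(), the example addresses and the banned phrases are all made up for this example; a real site would load its lists from a file or database.

```php
<?php
/**
 * Decide whether a form submission should be rejected outright.
 * Hypothetical helper for illustration only.
 *
 * @param string $ip            The visitor's IP address
 * @param array  $fields        The submitted form fields
 * @param array  $blockedIps    IP addresses we refuse submissions from
 * @param array  $bannedPhrases Phrases that mark a submission as spam
 *
 * @return bool True if the submission should be blocked
 */
function isBlockedSubmission($ip, array $fields, array $blockedIps, array $bannedPhrases)
{
    if (in_array($ip, $blockedIps, true)) {
        return true;
    }

    foreach ($fields as $value) {
        if (!is_string($value)) {
            continue;
        }
        foreach ($bannedPhrases as $phrase) {
            // Case-insensitive search for the phrase anywhere in the field
            if (stripos($value, $phrase) !== false) {
                return true;
            }
        }
    }

    return false;
}

// Example lists (assumptions, not real data)
$blockedIps    = array('203.0.113.7', '198.51.100.23');
$bannedPhrases = array('cheap pills', 'casino bonus');

$ip = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '';
if (isBlockedSubmission($ip, $_POST, $blockedIps, $bannedPhrases)) {
    // Stop processing here, for example by redisplaying the form with an error
}
```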

When someone first visits the page on your site that you want to protect with CAPTCHA, we generate a code and store it in the session. This is the code that the CAPTCHA image will be generated from. In our form, near the text field, we include an image tag which references a PHP script instead of an image file on the server.

/**
 * generateSecret
 * Generate a random string $len chars long
 *
 * @param int $len Length of string to generate (default 6)
 *
 * @return mixed False if $len is not a number, otherwise the secret string
 */
function generateSecret($len=6)
{
    if (!is_numeric($len)) {
        return false;
    }

    // Get a list of all the possible chars to use
    $allChars = array_merge(range('a', 'z'), range('0', '9'));

    // Which characters do we want to exclude because they are too easy
    // to mistake for other chars
    $bannedChars = array (
        'o', 'O', '0',
        'i', 'l', '1', 'I',
        'g', '9',
        '5', 's',
    );

    // Work out our secret one random character at a time. Asking array_rand()
    // for all $len keys at once would return them in their original order,
    // so every secret would come out sorted.
    $allowedChars = array_diff($allChars, $bannedChars);
    $secret = '';
    for ($i = 0; $i < $len; $i++) {
        $secret .= $allowedChars[array_rand($allowedChars)];
    }

    return $secret;
}

// The session must be started before $_SESSION can be used
session_start();

$_SESSION['captcha'] = generateSecret();
$_SESSION['captchaLoads'] = 0;


This PHP script will dynamically generate our image for us and also increment a counter in the session. Why keep a counter? Well, without a counter it would be easy to work out the code once and then submit it with different details but the same code over and over again.

$_SESSION['captchaLoads']++;

By checking to make sure this counter isn’t over 1 (assuming we set it to 0 when we generate the secret) when we generate the image, we can be sure that this type of attack can’t happen.

if ($_SESSION['captchaLoads'] > 1) {
    return '';
} else {
    // Generate our captcha image
}

We also append a random string or code to the image URL as a query string. Your script can simply ignore it, but adding it avoids the problem where the web browser might cache the CAPTCHA image.

<img src="captchaimage.php?879327def" alt="CAPTCHA image" />
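One way to produce that random suffix on each page load is sketched below. captchaImageUrl() is a hypothetical helper name, and the script name captchaimage.php is taken from the example tag above. The token only has to differ between requests, so it does not need to be unpredictable.

```php
<?php
/**
 * Build the CAPTCHA image URL with a throwaway token on the end so the
 * browser treats every page load as a new image.
 * Hypothetical helper for illustration only.
 *
 * @param string $script The image script to point at
 *
 * @return string The URL including the random query string
 */
function captchaImageUrl($script = 'captchaimage.php')
{
    // mt_rand() is fine here: the token defeats caching and is not a secret
    $token = dechex(mt_rand(0, 0xFFFFFFF));

    return $script . '?' . $token;
}

echo '<img src="' . htmlspecialchars(captchaImageUrl()) . '" alt="CAPTCHA image" />';
```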

The last thing to take into account is that you don’t really want the web browser to autocomplete on the captcha field so we need to disable autocompletion for that field.

Your normal input field would look like

<input type="text" name="captcha" id="captcha" value="" />

We can disable autocomplete easily by adding the autocomplete="off" attribute:

<input type="text" name="captcha" id="captcha" value="" autocomplete="off" />

With the HTML 4 and XHTML 1.0 doctypes, however, this causes the page to fail validation, since autocomplete is not part of those specifications (it was only standardised later, in HTML5). To get around this we can set the attribute with JavaScript instead.

<script type="text/javascript">
window.onload = function()
{
    var f = document.getElementById('captcha');
    if (f) {
        f.setAttribute("autocomplete", "off");
    }
};
</script>

More PHP Code

We will start with a basic script that reads the text from the session and creates an image from it:

/**
 * CreateCaptchaImage
 * Create the captcha image based on the secret in the captcha variables
 *
 * @return false If we are unable to generate an image return false
 */
function CreateCaptchaImage()
{
    // Why generate images if we can't ?
    if (!function_exists('imageCreateFromPNG')) {
        return false;
    }

    if (!function_exists('imagettftext')) {
        return false;
    }

    if (empty($_SESSION['captcha'])) {
        return false;
    }

    if (!is_file(dirname(__FILE__).'/captcha.ttf')) {
        return false;
    }

    $width = 150;
    $height = 50;
    $fontSize = 24;
    $x = 30;
    $y = 35;
    $angle = 0;

    header('Content-type: image/png');

    $img_handle = imageCreate($width, $height);

    // The first colour allocated to the palette is our background
    ImageColorAllocate ($img_handle, 255, 150, 150);

    // Set our text to black
    $text_color = ImageColorAllocate ($img_handle, 0, 0, 0);

    imagettftext($img_handle, $fontSize, $angle, $x, $y, $text_color, dirname(__FILE__).'/captcha.ttf', $_SESSION['captcha']);

    // Increment the number of times we have looked at the captcha in the session
    $_SESSION['captchaLoads']++;

    ImagePng ($img_handle);
    ImageDestroy ($img_handle);
}

// The session must be started before the secret can be read from it
session_start();

CreateCaptchaImage();

The script that generates our image will need to do some things to try to make it harder for a computer program to read the image.

  1. Alter the background

  2. Change the position of the text

  3. Alter the font

Alter the background

We can do this in two ways. First, we can change

    // The first colour allocated to the palette is our background
    ImageColorAllocate ($img_handle, 255, 150, 150);

to something like

    // The first colour allocated to the palette is our background
    ImageColorAllocate ($img_handle, rand(100, 255), rand(100, 255), rand(100, 255));

… which will change the background randomly (we use 100 as the starting value so that, assuming we keep the text black, the text will still be visible to the user fairly easily). We could also use the various GD image fill functions to give us various gradient fills for the background.
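A gradient background can be approximated by drawing one horizontal line per row, each in a slightly different colour. The sketch below is one possible way to do it: gradientRows() and fillGradient() are hypothetical helper names, while imagecolorallocate() and imageline() are standard GD functions. Images made with imageCreate are limited to a 256-colour palette, so this assumes the image is well under 256 pixels tall, which holds for the 50 pixel height used here.

```php
<?php
/**
 * Work out one RGB colour per row for a vertical gradient.
 * Hypothetical helper for illustration only.
 *
 * @param int   $height Number of rows in the image
 * @param array $from   Colour of the top row as array(red, green, blue)
 * @param array $to     Colour of the bottom row as array(red, green, blue)
 *
 * @return array One array(red, green, blue) per row
 */
function gradientRows($height, array $from, array $to)
{
    $rows = array();
    $steps = max(1, $height - 1);

    for ($row = 0; $row < $height; $row++) {
        $colour = array();
        for ($c = 0; $c < 3; $c++) {
            // Move each channel a proportional distance from $from towards $to
            $colour[] = (int) round($from[$c] + ($to[$c] - $from[$c]) * $row / $steps);
        }
        $rows[] = $colour;
    }

    return $rows;
}

/**
 * Paint the gradient onto a GD image, one horizontal line per row.
 */
function fillGradient($img_handle, $width, $height, array $from, array $to)
{
    foreach (gradientRows($height, $from, $to) as $row => $rgb) {
        $colour = imagecolorallocate($img_handle, $rgb[0], $rgb[1], $rgb[2]);
        imageline($img_handle, 0, $row, $width - 1, $row, $colour);
    }
}

// Possible use inside CreateCaptchaImage(), after the image has been created:
// fillGradient($img_handle, $width, $height, array(255, 200, 200), array(200, 200, 255));
```

Both example colours keep every channel at 200 or above, for the same reason the random background starts at 100: black text has to stay readable on top of it.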

Alternatively we can change the line:

$img_handle = imageCreate($width, $height);

which creates a blank image, to something like this:

$file = PickRandomBackgroundFile();
$img_handle = imagecreatefrompng($file);

The PickRandomBackgroundFile function would be something like this:

/**
 * PickRandomBackgroundFile
 * Choose a random .png file from the backgrounds directory
 *
 * @return mixed False on any problem, the full path to the background file if
 * a file in the right directory with a png extension was found
 */
function PickRandomBackgroundFile()
{
    $dir = dirname(__FILE__).'/backgrounds';

    $files = array();

    if (!is_dir($dir)) {
        return false;
    }

    $dh = opendir($dir);

    if ($dh === false) {
        return false;
    }

    while (($file = readdir($dh)) !== false) {
        // Skip checking the current and parent directory
        if ($file == '.' || $file == '..') {
            continue;
        }

        if (is_file($dir.'/'.$file)) {
            $parts = explode('.', $file);
            $ext = array_pop($parts);
            if (strtolower($ext) == 'png') {
                $files[] = $dir.'/'.$file;
            }
        }
    }

    // Release the directory handle now that we have our list
    closedir($dh);

    if (empty($files)) {
        return false;
    }

    return $files[array_rand($files)];
}

Change the Position of the Text

The imagettftext function can take an angle at which it will draw the text on an image. So we can change this line:

$angle = 0;

…to this:

$angle = rand(-5, 5);

… and the text will be tilted by up to 5 degrees in either direction. We should also vary $x and $y a bit, but I’ll leave that up to you.
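As a starting point for varying the position, here is a small sketch. jitter() is a hypothetical helper name, and the spreads of 10 and 5 pixels are guesses for the 150 by 50 image and 24 point font used above, so check that the text is not clipped with your own font.

```php
<?php
/**
 * Return a random whole number within $spread of $base.
 * Hypothetical helper for illustration only.
 *
 * @param int $base   The value to centre on
 * @param int $spread How far the result may stray in either direction
 *
 * @return int A value between $base - $spread and $base + $spread inclusive
 */
function jitter($base, $spread)
{
    return mt_rand($base - $spread, $base + $spread);
}

$x = jitter(30, 10);    // 20 to 40 pixels from the left edge
$y = jitter(35, 5);     // text baseline 30 to 40 pixels from the top
$angle = jitter(0, 5);  // same range as rand(-5, 5)
```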

Altering the Font

Another thing that we can do is choose the font at random from a selection of fonts. We can alter our previous PickRandomBackgroundFile function to be more generic so it looks like this:

/**
 * PickRandomFile
 * Choose a random file from the given directory
 *
 * @param string $dir The directory to look in
 * @param string $extension The lowercase version of the extension to look for
 *
 * @return mixed False on any problem, the full path to the file if
 * a file in the given directory with the given extension was found
 */
function PickRandomFile($dir, $extension)
{
    $files = array();

    if (!is_dir($dir)) {
        return false;
    }

    if (empty($extension)) {
        return false;
    }

    $dh = opendir($dir);

    if ($dh === false) {
        return false;
    }

    while (($file = readdir($dh)) !== false) {
        // Skip checking the current and parent directory
        if ($file == '.' || $file == '..') {
            continue;
        }

        if (is_file($dir.'/'.$file)) {
            $parts = explode('.', $file);
            $ext = array_pop($parts);
            if (strtolower($ext) == $extension) {
                $files[] = $dir.'/'.$file;
            }
        }
    }

    // Release the directory handle now that we have our list
    closedir($dh);

    if (empty($files)) {
        return false;
    }

    return $files[array_rand($files)];
}

We can call the PickRandomFile function like this:

$fontFile = PickRandomFile(dirname(__FILE__).'/fonts', 'ttf');

…and it will pick a random font for us. We can also use this for our backgrounds like so:

$backgroundFile = PickRandomFile(dirname(__FILE__).'/backgrounds', 'png');
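Once fonts are chosen at random, the image script no longer needs to depend on a single captcha.ttf file. The sketch below shows one way to pick a font with a fallback. It uses PHP's glob() as a shorter stand-in for PickRandomFile so that the example is self-contained, and chooseFontFile() is a hypothetical helper name. Note that glob() matches case-sensitively on most servers, so a file named FONT.TTF would be skipped here.

```php
<?php
/**
 * Pick a random .ttf file from $fontsDir, or fall back to $fallbackFont.
 * Hypothetical helper for illustration only.
 *
 * @param string $fontsDir     Directory holding the font files
 * @param string $fallbackFont Full path to a font to use if the directory is empty
 *
 * @return mixed The full path to a font file, or false if none is available
 */
function chooseFontFile($fontsDir, $fallbackFont)
{
    $fonts = is_dir($fontsDir) ? glob($fontsDir . '/*.ttf') : array();

    if (!empty($fonts)) {
        return $fonts[array_rand($fonts)];
    }

    return is_file($fallbackFont) ? $fallbackFont : false;
}

$fontFile = chooseFontFile(dirname(__FILE__) . '/fonts', dirname(__FILE__) . '/captcha.ttf');
if ($fontFile === false) {
    // No usable font: the image script should bail out here
}
```

The resulting $fontFile would then be passed to imagettftext in place of the fixed captcha.ttf path.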

Conclusion

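After you’re happy with your CAPTCHA strength, you need to set up the check in your form submission page to make sure that the person filling out the form has entered the correct code. This is very easy to do and all it takes is a simple if statement:

// An empty or missing code must fail too, otherwise a request with no
// session and no captcha field would compare as equal and get through
if (empty($_SESSION['captcha']) || !isset($_POST['captcha']) || $_SESSION['captcha'] != $_POST['captcha']) {
    echo "The image verification failed.<br />\n";
    return false;
}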

…which could easily drop into the function that checks that they filled the form in properly.

You should keep in mind, however, that if you make your CAPTCHA very strong, a person might sometimes be unable to work out what the text says, or may accidentally type a letter wrong. If the form is one that takes a while to fill out, like a contact form, you should redisplay it with their entries and a warning, rather than taking them to the results page and then back to an empty form that they will have to fill out all over again.

A CAPTCHA can be a very effective way to reduce the amount of spam submitted to your site. However, you should also be aware of its limitations when deciding whether to implement one.
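A rough sketch of that flow is shown below. checkCaptchaSubmission() is a hypothetical helper name, and it assumes the secret is stored in lowercase (as generateSecret produces it), so the visitor's input is lowercased before comparing. It also issues a fresh code after every attempt, so a solved code cannot be reused.

```php
<?php
/**
 * Check the submitted CAPTCHA and collect the other fields for redisplay.
 * Hypothetical helper for illustration only.
 *
 * @param array    $post      The submitted form fields, normally $_POST
 * @param array    $session   The session data, normally $_SESSION
 * @param callable $newSecret Anything callable that returns a fresh code
 *
 * @return array 'ok' => whether the code matched, 'values' => escaped fields
 */
function checkCaptchaSubmission(array $post, array &$session, $newSecret)
{
    $expected = isset($session['captcha']) ? (string) $session['captcha'] : '';
    $given = '';
    if (isset($post['captcha']) && is_string($post['captcha'])) {
        $given = strtolower(trim($post['captcha']));
    }

    // An empty expected code never counts as a match
    $ok = ($expected !== '' && $given === $expected);

    // A code is only good for one attempt, pass or fail
    $session['captcha'] = call_user_func($newSecret);
    $session['captchaLoads'] = 0;

    // Keep everything else the visitor typed so the form can be refilled
    $values = array();
    foreach ($post as $name => $value) {
        if ($name !== 'captcha' && is_string($value)) {
            $values[$name] = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
        }
    }

    return array('ok' => $ok, 'values' => $values);
}
```

In the form handler this might be called as checkCaptchaSubmission($_POST, $_SESSION, 'generateSecret'). When 'ok' is false, the page prints the warning and writes each entry in 'values' back into the matching field's value attribute.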


6 Responses to “Getting started with CAPTCHA”

  1. Pau says:

    Excellent Article !

    Thks

  2. Douglas says:

    Nice article indeed,
    Thanks Rodney… ;)

  3. Douglas says:

    In the function CreateCaptchaImage
    if (!is_file(dirname(__FILE__).'/captcha.ttf') {
    Needs to be
    if (!is_file(dirname(__FILE__).'/captcha.ttf')) {

    Thanks again Rodney

  4. Alfred Robert Rowe says:

    excellent article. You are a cornerstone.
    just as a reminder, anybody trying to include the random font function must comment out or delete these lines
    if (!is_file(dirname(__FILE__).'/captcha.ttf')) {
    return false;
    }
    your code will still work if you keep the font captcha.ttf and still maintain the fonts folder, but as soon as you delete captcha.ttf from that location your code will fail.

  5. ankit says:

    i have to deliver a seminar on captcha, i have found this very useful and worthwhile

  6. Captcha says:

    Hi .. good article .. I noticed another method… the image of an animal and the captcha question said – "Verification: Please enter the word describing the type of animal shown below."
