Toward Viable CAPTCHA Alternatives

This article shows a not especially novel but very workable approach to stopping blog (and similar) bot spam without having to resort to those ugly CAPTCHA images.

Image-based CAPTCHAs work reasonably well, but if you are like me, you probably find them annoying as hell, and it is often very difficult to type the letters and numbers correctly the first time. In other words, in their effort to defeat the bots, they throw the baby out with the bathwater.

A number of alternatives have been proposed, but they all seem to carry some inherent risk of being defeated by bots or OCR.

Sometimes people are confused about what CAPTCHA techniques can and cannot do. If I want to go to your blog, and spam it, I can fill out the CAPTCHA phrase from your image and spam away. It's the automated spam-bots that CAPTCHA thwarts.

In my opinion, the best method would make it so that the person (or bot) attempting to "fill out your form" has no idea that any validation is even occurring. That raises the bar a bit, right from the get-go.

This is actually not that difficult to do, because it can all be set up on the server, and anything on the client side is hidden from view.

What we need to do is create a unique "key" when a user visits our page with the form. This key can be simple (in this example, just 4 characters), or it can be as strong and complex as you want. This value needs to be stored on the server side (Session is fine) as well as on the client side.

So, we create a unique key for this page view and store it in Session; on the client side we can put it in a hidden form element, or even a cookie value. When the form is submitted, the server and client keys are compared for equality and then purged.

This is relatively secure because it's a two-step process.

First, you would have to visit the originating form page in order to get a key generated in the first place. So automated "bot" submissions from "off site" won't even get a key, and thus they will fail.

Second, the hidden field or cookie name can also be unique for each page view, making it harder for bots to immediately adapt to your site.

In other words, this should be sufficient to stop most bots, though it certainly won't stop all of them. But then, neither will image-based CAPTCHA when the bots use OCR. MVP Casey Chesnut illustrated this quite well.

If the crummy images turn you off, then you could try this method and see if it holds up against spammers.

Once again, here is the basic process:

  • Visitor requests your blog page with the form.
  • Server begins processing and generates a unique key, as well as a unique hidden field name or a unique cookie name to carry the key.
  • Key is stored on the server side and written to the client side.
  • Page is delivered.
  • Visitor fills out the form and submits.
  • Server compares the server and client keys. If they match, the post is allowed.
  • The keys are dumped, regardless of success or failure, and a new key and field/cookie name are generated upon failure.
  • You provide a result based on success or failure in the postback page reload.
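
To make the flow concrete, here is a minimal, framework-free sketch of the issue/compare/purge cycle described in the list above. It is only an illustration, not the sample project's code: the class name OneTimeFormKey and its methods are hypothetical, a plain object stands in for one visitor's Session, and a dictionary stands in for the posted form fields.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Hypothetical helper (not part of the sample project): it stands in for
// the per-visitor Session entries described in the steps above.
public class OneTimeFormKey
{
    private string fieldName;   // hidden field name issued for this page view
    private string key;         // value that hidden field must carry back

    // Generate a fresh field name and key and remember them server-side.
    // The returned pair (name, value) is what would be written into the page.
    public KeyValuePair<string, string> Issue()
    {
        fieldName = "f" + RandomHex(4);
        key = RandomHex(8);
        return new KeyValuePair<string, string>(fieldName, key);
    }

    // Compare the posted value with the stored key, then purge both so the
    // same pair cannot be replayed, whether or not the comparison succeeded.
    public bool Validate(IDictionary<string, string> postedFields)
    {
        string expectedField = fieldName;
        string expectedKey = key;
        fieldName = null;
        key = null;

        if (postedFields == null || expectedField == null || expectedKey == null)
        {
            return false;   // nothing was issued, so an off-site post fails
        }

        string posted;
        return postedFields.TryGetValue(expectedField, out posted)
            && string.Equals(posted, expectedKey, StringComparison.Ordinal);
    }

    // Random bytes rendered as upper-case hexadecimal text.
    private static string RandomHex(int byteCount)
    {
        byte[] data = new byte[byteCount];
        RandomNumberGenerator.Create().GetBytes(data);
        return BitConverter.ToString(data).Replace("-", "");
    }
}
```

A page would call Issue() when rendering the form and Validate() on postback; a second Validate() with the same posted values returns false because the pair has already been purged.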

Now let's see how this can be easily implemented:

Unique key Generator Class:

using System;
using System.Security.Cryptography;
using System.Text;

    public class UniqueId
    {
        // Returns a random string of maxSize characters drawn from a-z, A-Z and 0-9.
        public static string GetUniqueKey(int maxSize)
        {
            char[] chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890".ToCharArray();
            byte[] data = new byte[maxSize];
            RNGCryptoServiceProvider crypto = new RNGCryptoServiceProvider();
            crypto.GetNonZeroBytes(data);
            StringBuilder result = new StringBuilder(maxSize);
            foreach (byte b in data)
            {
                // Use chars.Length (not chars.Length - 1) so the last character can be chosen too.
                result.Append(chars[b % chars.Length]);
            }
            return result.ToString();
        }
    }
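
Reducing a random byte with the modulo operator, as the generator above does, makes some characters slightly more likely than others, because the number of possible byte values is not a multiple of the alphabet size. For a 4-character key that hardly matters, but if you want longer, stronger keys, one common remedy is rejection sampling. The following is a sketch under that assumption; the class name UnbiasedKey is hypothetical and not part of the sample project.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical variant of the key generator that discards byte values
// which would otherwise skew the character distribution.
public static class UnbiasedKey
{
    private const string Alphabet =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    public static string Generate(int length)
    {
        if (length < 0)
        {
            throw new ArgumentOutOfRangeException("length");
        }

        // Largest multiple of the alphabet size that fits in the 256 byte values
        // (248 for 62 characters). Bytes at or above this limit are rejected.
        int limit = 256 - (256 % Alphabet.Length);

        RandomNumberGenerator rng = RandomNumberGenerator.Create();
        StringBuilder result = new StringBuilder(length);
        byte[] buffer = new byte[1];

        while (result.Length < length)
        {
            rng.GetBytes(buffer);
            if (buffer[0] >= limit)
            {
                continue;   // rejected: draw another byte
            }
            result.Append(Alphabet[buffer[0] % Alphabet.Length]);
        }
        return result.ToString();
    }
}
```

Calling UnbiasedKey.Generate(16) returns a 16-character key in which every character of the alphabet is equally likely at each position.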

Page codebehind class for test page:

using System;
using System.Data;
using System.Configuration;
using System.Collections;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Web.UI.HtmlControls;

public partial class _Default : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        if (!IsPostBack)
        {
            Page.Title = "Sample Bot Thwarter Form";
            DoStuff();
        }


        // Supply your own value here: the form must be submitted from "here".
        // UrlReferrer is null when the client sends no Referer header, so check it first.
        else if (Request.UrlReferrer != null &&
                 Request.UrlReferrer.ToString().EndsWith("Default.aspx", StringComparison.OrdinalIgnoreCase))
        {
            string fieldName = (string)Session["fieldName"];
            string originalKey = (string)Session["key"];
            string keyToMatch = (fieldName != null) ? Request.Params[fieldName] : null;

            // The keys are single-use: purge them whether or not they match.
            Session.Remove("key");
            Session.Remove("fieldName");

            if (originalKey != null && originalKey.Equals(keyToMatch, StringComparison.Ordinal))
            {
                lblMessage.Text = "Thanks for your Post!";
                this.TextBox1.Text = "";
                this.TextBox2.Text = "";
                this.TextBox3.Text = "";
            }
            else
            {
                // Issue a fresh key and field name for the next attempt.
                DoStuff();
                lblMessage.Text = "Sorry, We don't accept bot posts.";
            }
        }
    }

    private void DoStuff()
    {
        string key = UniqueId.GetUniqueKey(4);
        string fieldName = UniqueId.GetUniqueKey(4);
        Session["key"] = key;
        Session["fieldName"] = fieldName;

        HiddenField hidnfld = new HiddenField();
        hidnfld.ID = fieldName;
        hidnfld.Value = key;
        Page.Form.Controls.Add(hidnfld);

        // Let's add a couple more "decoy" hidden fields just so the bots will go nuts:
        HiddenField hidnfld2 = new HiddenField();
        hidnfld2.ID = UniqueId.GetUniqueKey(4);
        hidnfld2.Value = UniqueId.GetUniqueKey(4);
        Page.Form.Controls.Add(hidnfld2);

        HiddenField hidnfld3 = new HiddenField();
        hidnfld3.ID = UniqueId.GetUniqueKey(4);
        hidnfld3.Value = UniqueId.GetUniqueKey(4);
        Page.Form.Controls.Add(hidnfld3);
    }
}

Page (ASPX) Markup:

<%@ Page Language="C#" AutoEventWireup="true" CodeFile="Default.aspx.cs" Inherits="_Default" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" >
<head runat="server">
    <title></title>
</head>
<body>
    <form id="form1" runat="server">
    <div>
        <asp:Label ID="Label1" runat="server" Text="Add A Comment to my Blurg:" Width="213px"></asp:Label><br />
        <br />
        <asp:Label ID="Label2" runat="server" Text="Name:"></asp:Label>
        <asp:TextBox ID="TextBox1" runat="server" Width="159px"></asp:TextBox><br />
        <asp:Label ID="Label3" runat="server" Text="URL:"></asp:Label>&nbsp;
        <asp:TextBox ID="TextBox2" runat="server" Width="161px"></asp:TextBox><br />
        <asp:Label ID="Label4" runat="server" Text="Comment:"></asp:Label><br />
        <asp:TextBox ID="TextBox3" runat="server" Height="112px" 
            TextMode="MultiLine" Width="202px"></asp:TextBox><br />
        <asp:Button ID="Button1" runat="server"  Text="Submit It" /><br />
        <br />
        <asp:Label ID="lblMessage" runat="server" Width="208px"></asp:Label></div>
    </form>
</body>
</html>

If you want to get even more sophisticated, you can use this same technique to dynamically change the IDs of ALL form fields, making it very difficult indeed for spammers. All you need to do is store the translations in Session state, so you can pick them back up on a postback. Another check you can add, as the sample does, is to check the HTTP Referer header: if the request did not come from your page or your site, you disallow the post.
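
As a rough illustration of the "store the translations" idea, here is a small sketch of a map between random aliases and real field names. The class FieldAliasMap and its methods are hypothetical; in a real page an instance would be kept in Session between rendering the form and handling the postback, and the dictionary passed to Translate stands in for the posted form values.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Hypothetical helper: remembers which random alias was rendered for each
// real field name so that a postback can be translated back.
public class FieldAliasMap
{
    private readonly Dictionary<string, string> aliasToReal =
        new Dictionary<string, string>();

    // Create a random alias for a real field name; the alias is what gets
    // written into the page instead of the real name.
    public string AliasFor(string realName)
    {
        byte[] data = new byte[6];
        RandomNumberGenerator.Create().GetBytes(data);
        string alias = "f" + BitConverter.ToString(data).Replace("-", "");
        aliasToReal[alias] = realName;
        return alias;
    }

    // Translate posted values back to real field names. Posted names that
    // were never issued (decoys, or fields a bot guessed) are ignored.
    public Dictionary<string, string> Translate(IDictionary<string, string> posted)
    {
        Dictionary<string, string> result = new Dictionary<string, string>();
        foreach (KeyValuePair<string, string> pair in posted)
        {
            string realName;
            if (aliasToReal.TryGetValue(pair.Key, out realName))
            {
                result[realName] = pair.Value;
            }
        }
        return result;
    }
}
```

A bot that posts to the well-known field names (for example "Comment") gets nothing through, because only the aliases issued for that page view are translated.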

Download the Visual Studio 2005 Test Solution (WebSite project)

By Peter Bromberg