Keep the Baddies away from your website or blog

There is no question that spambot, botnet, malware and related activity has increased dramatically over the last couple of years. You might not even know it, but there are bots out there feeding on your website or blog - eating up your bandwidth, scraping your content, and attempting to place scripts and links to all kinds of malware. Fight back!

Quis custodiet ipsos custodes? (Who watches the watchmen?) -- Juvenal


Make no mistake - they do it in comment spam, in articles they try to post, and in script attacks that attempt SQL injection, among other shady techniques.

At the ittyurl.net site that I run for "social searchable short urls", I can't make everyone authenticate just to submit a link - so I've developed a set of defenses that go a long way toward stopping the "baddies" in their tracks. The solution I present here is very barebones - it's just a starting point, and I've left out quite a bit of the exception checking in order to keep the code simpler and more readable. But if it gives you some ammunition in keeping the quality of your ASP.NET blog or website up to snuff, I'm glad to have been able to help.

I have, basically, three sets of "traps" that I use to stop spam links and content: "Bad Words", "Banned Domains" and "Banned IPs". I also check for things like malformed urls, or urls with script links, redirects, or other telltale signs that I'm not dealing with a "normal" link. Most of that site-specific code is not included in this sample, as it only makes sense for one particular website.

When somebody submits a url to be shortened and indexed, I spider the target web page to compile a list of "tag words" - and in the process, the page gets run through my "Bad Words" list. If there is offensive content, the link submission is automatically rejected. The "Banned Domains" check doesn't just match entire domains - it can match url "fragments" too, and reject them. For example, a lot of spammers join some member site and put up porn advertisements on their profile page, then set out to place as many links as they can at sites that accept links. Typically these urls contain the telltale "members.php" fragment. There's no way a link like that would represent content of general interest to the developer community, so those script-kiddies never even get to my Page handler. In fact, you could not successfully submit this very article page to IttyUrl.net, because the page contains an offensive word. Of course, if a legitimate page contains an offending word or phrase, I have engineered a way to override this behavior, but I'm not going to disclose it here. You get the idea.
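To make that concrete, here's a minimal sketch of what the submission-time check might look like. This is not the actual IttyUrl code - the method name is an illustrative placeholder, the Global.BadWords and Global.BannedDomains lists correspond to the lists described in the synopsis below (see the Global class sketch later on), and a real implementation would need the exception handling I've omitted throughout:

// Illustrative sketch only - assumes the BadWords and BannedDomains lists
// have already been loaded at Application_Start (see the Global class sketch
// further down in this article).
public static bool IsAcceptableSubmission(string url)
{
    // Reject urls that contain a banned domain or a telltale fragment
    // such as "members.php".
    foreach (string banned in Global.BannedDomains)
    {
        if (url.IndexOf(banned, StringComparison.OrdinalIgnoreCase) >= 0)
            return false;
    }

    // Spider the target page and run its content through the bad-words list.
    string pageText;
    using (var client = new System.Net.WebClient())
    {
        pageText = client.DownloadString(url);
    }

    foreach (string word in Global.BadWords)
    {
        if (pageText.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0)
            return false;
    }

    return true;
}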

Here's a short synopsis of how I do the filtering:

1) In my database, I have three tables: BadWords, BannedIPs, and BannedDomains. I have an admin-protected database page that lets me quickly enter free-form SQL inserts to add or delete items in any of these tables. I also have an admin-protected "delete links" page that lets me visit any submitted link and, by clicking a second link, quickly delete offensive content that's made it past my filters.

2) In the Application_Start event handler of my sites, I load each of the three database tables into a generic List<string>.

3) In my global class, I have three static methods, each of which returns a boolean: IsBadWord, IsBannedDomain, and IsBannedIP. These methods can be called from any page on the site, but the first line of defense comes before anybody even gets to a Page handler - I use the Application_PreRequestHandlerExecute event to do the checking. PreRequestHandlerExecute is a good place for this sort of filtering, as it fires before the Request is handed off to a Page, WebService or other Page-type handler. That saves time and resources, because a denied request never gets the chance to tie up a thread with a page handler. It looks like this (a sketch of the Application_Start loading code and the three check methods follows after the handler):

protected void Application_PreRequestHandlerExecute(object sender, EventArgs e)
{
    if (HttpContext.Current.Request.Url.AbsolutePath.Length > 2000)
    {
        // I also use Regex here to check for valid urls.
        Exception ex = new Exception("Url over 2000 char");
        ex.Data.Add("Host", HttpContext.Current.Request.UserHostAddress);
        // log the data using your preferred mechanism --
        // PAB.ExceptionHandler.ExceptionLogger.HandleException(ex);
        HttpContext.Current.Response.StatusCode = 404;
        HttpContext.Current.Response.SuppressContent = true;
        HttpContext.Current.Response.End();
        return;
    }

    if (IsBannedIP(Request.UserHostAddress))
    {
        // PAB.ExceptionHandler.ExceptionLogger.HandleException(
        //     new Exception("Banned IP:" + Request.UserHostAddress + ": " + Request.RawUrl + ":UA=" + Request.UserAgent));
        HttpContext.Current.Response.StatusCode = 404;
        HttpContext.Current.Response.SuppressContent = true;
        HttpContext.Current.Response.End();
        return;
    }

    // Reverse-DNS the caller's IP so the host name can be checked against the banned-domain list.
    string userHostName = System.Net.Dns.GetHostEntry(Request.UserHostAddress).HostName;

    if (IsBannedDomain(userHostName))
    {
        HttpContext.Current.Response.StatusCode = 404;
        HttpContext.Current.Response.SuppressContent = true;
        HttpContext.Current.Response.End();
        return;
    }
}
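For reference, here's a rough sketch of the supporting pieces from steps 2 and 3 above - loading the three tables into static generic lists at Application_Start, and the three check methods. This is illustrative rather than the exact code from my site; the table and column names, the "badGuys" connection string key, and the LoadList helper are assumptions, and real code should handle connection failures:

// Rough sketch of the supporting pieces in Global.asax.cs.
// Requires: System, System.Collections.Generic, System.Configuration,
// System.Data.SqlClient. Table and column names here are illustrative.

public static List<string> BadWords = new List<string>();
public static List<string> BannedIPs = new List<string>();
public static List<string> BannedDomains = new List<string>();

protected void Application_Start(object sender, EventArgs e)
{
    string connStr =
        ConfigurationManager.ConnectionStrings["badGuys"].ConnectionString;
    LoadList("SELECT Word FROM BadWords", BadWords, connStr);
    LoadList("SELECT IP FROM BannedIPs", BannedIPs, connStr);
    LoadList("SELECT Domain FROM BannedDomains", BannedDomains, connStr);
}

// Helper that reads a single-column result set into a List<string>.
private static void LoadList(string sql, List<string> target, string connStr)
{
    using (SqlConnection conn = new SqlConnection(connStr))
    using (SqlCommand cmd = new SqlCommand(sql, conn))
    {
        conn.Open();
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
                target.Add(reader.GetString(0).ToLowerInvariant());
        }
    }
}

public static bool IsBannedIP(string ip)
{
    return !string.IsNullOrEmpty(ip) && BannedIPs.Contains(ip);
}

public static bool IsBannedDomain(string hostOrUrl)
{
    if (string.IsNullOrEmpty(hostOrUrl)) return false;
    string candidate = hostOrUrl.ToLowerInvariant();
    // A substring match catches whole domains as well as url "fragments"
    // such as "members.php".
    foreach (string banned in BannedDomains)
        if (candidate.Contains(banned)) return true;
    return false;
}

public static bool IsBadWord(string text)
{
    if (string.IsNullOrEmpty(text)) return false;
    string candidate = text.ToLowerInvariant();
    foreach (string word in BadWords)
        if (candidate.Contains(word)) return true;
    return false;
}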

In some cases, you may also want to deny requests whose referer is on your "baddies" list. In that case, you can call IsBannedDomain(Request.UrlReferrer.ToString()).
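Something along these lines could be added to the same Application_PreRequestHandlerExecute handler - just remember that UrlReferrer is null when the client sends no referer header, so check for that first:

// Optional referer check inside Application_PreRequestHandlerExecute.
// UrlReferrer is null when no referer header is sent, so guard against that.
if (Request.UrlReferrer != null && IsBannedDomain(Request.UrlReferrer.ToString()))
{
    HttpContext.Current.Response.StatusCode = 404;
    HttpContext.Current.Response.SuppressContent = true;
    HttpContext.Current.Response.End();
    return;
}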

You can download the barebones Visual Studio 2008 Solution and experiment with it. Just create a database named "BadGuys" and run the enclosed SQL script to create the three tables and add some sample data to each. Make sure the "badGuys" connection string in web.config matches your environment.

Since adding and "tuning" the above filters (plus some more customized ones) on my site, I've been able to stop 99% of spam link submissions in their tracks - before they even get to a Page handler. Before doing this, I was deleting 10 to 15 spam links a day; now a whole week often goes by before I have to take any manual action to keep the site clean. Have fun, and practice "safe computing". Remember, it's "us against them"! Die, spammers!

By Peter Bromberg