Build a Dictionary Lookup ASP.NET ServerControl
by Peter A. Bromberg, Ph.D.


As July draws to a close, "deep summer" in Florida gives us transplanted "Noo Yawkas" an easy choice about what to do with our spare time: either hang out at the pool or sit inside comfy air-conditioned quarters. (The pool works at least until 4 PM or so, when torrential rains and lightning force us not only to be inside, but to turn those nasty little PCs off lest they get blown to smithereens along with all the house wiring.) And so we pause to think about where we came from, where we have been, and, hopefully, where we are headed.



It's the Web! The web did it; it changed our lives. Long live Sir Tim! I first got on the web sometime around 1994. But it wasn't until around 2000 that Google and everything else really brought me what could be defined as "The Semantic Web."

You see, there are really two "Webs":

1) the "Internet" web - where there are sites and search engines and movies and everything else we all enjoy, and

2) the Semantic web - this is the web that has meaning, that's searchable and categorizable, and is more like a huge knowledge base that we can tap into and program against. This is thanks to things like Google, RSS, WebServices, and more, including, for Microsoft-oriented developers, the .NET Platform.

Now much of the web has "caught up" to the Semantic web concept, but a lot of it that provides useful content has not. One case in point is answers.com. If you have ever searched Google for a word or phrase, you'll notice a line with a hyperlink in the upper right side of the results page that looks somewhat like this:

Results 1 - 10 of about 20,500,000 for semantic [ definition ]. ( 0.34 seconds)

I've revised Google's custom link for "definition" above to reflect the actual URL it would point to from anybody's page, namely "http://www.answers.com/semantic". You can see that the term to be defined is simply appended to the end of the URL, preceded by a "whack" (a forward slash), and it takes you to answers.com with a nice definition, synonyms, a usage example, and even links to related queries. It's a really nice feature, especially since, if you are not sure of the spelling of a word, you can use it to see if you get a result. If not, most likely you have your word misspelled.
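
The URL scheme described above can be sketched as a small helper. This is only an illustration: the class and method names are hypothetical, and percent-encoding the term is my assumption about how multi-word or punctuated terms ought to be handled, not something the site documents.

```csharp
using System;

public static class AnswersUrl
{
    // Hypothetical helper: appends the term to the base URL after a slash.
    public static string Build(string term)
    {
        if (term == null) throw new ArgumentNullException("term");

        // Assumption: percent-encode the term so spaces and reserved
        // characters cannot break the URL.
        return "http://www.answers.com/" + Uri.EscapeDataString(term.Trim());
    }
}
```

Calling AnswersUrl.Build("semantic") yields the same URL shown above; a term with a space in it comes back with the space encoded as %20.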

Problem is, Answers.com hasn't caught up to the Semantic definition of the web - there's no API, only some sort of paste-in HTML that will highlight a word on your web page and allow users to go to the respective result page on Answers.com.

However, scraping some of this content out is as easy as pie. If you view source on the results page, you'll see this:

<meta name="description" content=" se·man·tic ( sĭ-măn ' tĭk ) also se·man·ti·cal ( -tĭ-kəl ) adj. Of or relating to meaning, especially meaning in language" >

Yup, they've actually stuck most of the definition (sometimes all of it, if it will fit) into a META tag in the "content" attribute value, probably so search engines will scoop it. (I mean, you don't spend most of your time reading meta tags, do you?).
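
To make the idea concrete, here is a minimal sketch of pulling that "content" value out of a page's HTML with a single regular expression. The class name is hypothetical, and the pattern assumes the attributes appear in the order shown above (name first, then content) with double-quoted values; a page that orders or quotes them differently would need a looser pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MetaDescription
{
    // Assumption: name="description" comes before content="...",
    // as in the sample tag above.
    private static readonly Regex Pattern = new Regex(
        "<meta\\s+name=\"description\"\\s+content=\"(?<content>[^\"]*)\"",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns the trimmed content value, or null if no such tag is found.
    public static string Extract(string html)
    {
        if (html == null) return null;

        Match m = Pattern.Match(html);
        return m.Success ? m.Groups["content"].Value.Trim() : null;
    }
}
```

Returning null when nothing matches lets the caller tell "no definition found" apart from an empty definition.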

On a side note, let me just say that this is an exercise in programming techniques, not a discussion on the legal and / or moral aspects of whether people should engage in scraping web content from other sites. You'll have to decide that for yourself, since you will be the eventual perpetrator (if a programmer could sink to that level).

So now let's walk through building a custom ASP.NET ServerControl that can display this. First, my Control, which I've derived from Panel because it makes sense in this case: all I need is a positionable DIV in which to place my results, which is precisely what the ASP.NET Panel Control is. Plus, I automatically get all the features of the ASP.NET Panel - CSS, positioning, font, you name it:

using System;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.ComponentModel;
using System.Collections;
using System.Data;
using System.Text;
using System.Web;
using System.Xml;
using System.Net;
using System.Text.RegularExpressions;

namespace PAB.WebControls
{
    [DefaultProperty("Text"),
     ToolboxData("<{0}:DictionaryLookup runat=server></{0}:DictionaryLookup>")]
    public class DictionaryLookup : Panel
    {
        // example: http://www.answers.com/esoteric
        // <meta name="description"
        //   content=" es·o·ter·ic ( ĕs ' ə-tĕr ' ĭk ) adj. Intended for
        //   or understood by only a particular group: an esoteric cult" >

        private string text;

        [Bindable(true), Category("Appearance"), DefaultValue("")]
        public string Text
        {
            get { return text; }
            set { text = value; }
        }

        private string wordToLookup;

        [Bindable(true), Category("Misc"), DefaultValue("")]
        public string WordToLookup
        {
            get { return wordToLookup; }
            set { wordToLookup = value; }
        }

        public void LookupWord(string targetWord)
        {
            // check for designmode or runtime mode
            if (HttpContext.Current != null)
            {
                if (targetWord != null)
                {
                    // download the results page and decode it as UTF-8
                    WebClient clnt = new WebClient();
                    byte[] b = clnt.DownloadData("http://www.answers.com/" + targetWord);
                    string resultHtml = System.Text.Encoding.UTF8.GetString(b);

                    // grab the meta tag, then cut out the value of its content attribute
                    string theStuff = GetTagByName("meta", resultHtml);
                    theStuff = theStuff.Substring(theStuff.IndexOf("content=\"") + 9);
                    int lastPos = theStuff.LastIndexOf("\"");
                    theStuff = theStuff.Substring(0, lastPos);
                    this.text = theStuff;
                }
            }
        }

        // Builds the regular expression that matches a complete tag by name
        private string GetExpressionForTagContents(string strTagName)
        {
            string strPatternTag;

            if (strTagName == "!")
                strPatternTag = "<!.*?-->";
            else if (String.Compare(strTagName, "!doctype", true) == 0)
                strPatternTag = "<!doctype.*?>";
            else if (String.Compare(strTagName, "br", true) == 0)
                strPatternTag = @"<br\s*/?\s*>";
            else
                strPatternTag = @"<(" + strTagName + @")(>|\s+[^>]*>).*?</\1\s*>";

            return strPatternTag;
        }

        // GetTagByName: Assumes tag exists
        // Returns everything up to closing tag or
        // to first < if no closing tag found
        private string GetTagByName(string strTagName, string strSource)
        {
            string strPatternTag = GetExpressionForTagContents(strTagName);
            string strPatternTagNoClose = "<" + strTagName + @"(>|\s+[^>]*>)[^<]";
            RegexOptions opts = RegexOptions.IgnoreCase | RegexOptions.Singleline;
            Match m;
            string strGetTagByName;

            m = Regex.Match(strSource, strPatternTag, opts);
            if (m.Value == "")
            {
                m = Regex.Match(strSource, strPatternTagNoClose, opts);
                // Regex.Match never returns null, so test Success instead
                if (!m.Success)
                    strGetTagByName = strSource;
                else
                    strGetTagByName = m.Value;
            }
            else
                strGetTagByName = m.Value;

            return strGetTagByName;
        }

        protected override void Render(HtmlTextWriter output)
        {
            LookupWord(this.wordToLookup);
            base.RenderBeginTag(output);
            if (this.text != null && this.text.Length > 0)
                output.Write(Text);
            base.RenderEndTag(output);
        }
    }
}

You can see here that I've got several things going on, so let's take a look at them one-by-one:

First, I've created a public Bindable property "WordToLookup" so that the developer can set the word in the Property Sheet as an actual design-time property of the control.

Next, I made a public method "LookupWord" which is really the "guts" of the little lean, mean, LookupMachine. It uses the WebClient class to request the answers.com page with the query term appended, decodes the resulting HTML into a string, and then uses two of the methods from John Vote's "WebWagon" ( www.idioma-software.com ): one builds a regular expression for the tag, and the other uses it to get the Meta Tag. The last part I got lazy on, because it's so simple. I just pull out the text content with substring matching, since it's very unlikely to ever be changed.

Then, we set the Text property of the control to this found text, which is our definition!

Finally, I override the Render method to ensure that the base Panel control and all its settings are rendered, then my content goes inside it, and then we close the Panel tag. DONE!
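
One thing worth considering at the Render step: the text comes from somebody else's page and is written into yours as-is. Here is a minimal sketch of encoding it first, assuming you want to treat the definition as plain text. The class name is hypothetical, and I'm using WebUtility.HtmlEncode from System.Net (available in later versions of the framework) only so the snippet stands alone; inside the control you would more likely call HttpUtility.HtmlEncode on the Text value before output.Write.

```csharp
using System;
using System.Net;

public static class DefinitionText
{
    // Hypothetical helper: HTML-encodes scraped text so any markup in it
    // is displayed literally instead of being interpreted by the browser.
    public static string ToSafeHtml(string scraped)
    {
        if (string.IsNullOrEmpty(scraped)) return string.Empty;
        return WebUtility.HtmlEncode(scraped);
    }
}
```

The trade-off is that if the scraped value already contains HTML entities, encoding it again will show those entities literally, so you may prefer to decode first or skip this step if you trust the source.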

Now here is some sample usage, and you can download the complete Visual Studio.Net 2003 solution below and go to town with it.


private void Page_Load(object sender, System.EventArgs e)
{
    // you can do either of the following, either on Page_Load or in the eventhandler
    // for a TextBox TextChanged event, for example. Or, you can set the "WordToLookup"
    // property of the control at design-time, in which case you do nothing.
    // Examples:
    //this.DictionaryLookup1.WordToLookup = "esoteric";
    //this.DictionaryLookup1.LookupWord("esoteric");
}

// and here, using a TextBox where the user fills in the word they want defined:
private void txtWord_TextChanged(object sender, System.EventArgs e)
{
    this.DictionaryLookup1.WordToLookup = this.txtWord.Text;
}


 

Download the complete VS.NET 2003 solution below

 

 


Peter Bromberg is a C# MVP, MCP, and .NET consultant who has worked in the banking and financial industry for 20 years. He has architected and developed web-based corporate distributed application solutions since 1995, and focuses exclusively on the .NET Platform.
