Asynchronous WebRequest: Basics
by Peter A. Bromberg, Ph.D.


"I'm always multitasking. Eating, on the phone, interview, everything all at once." -- Heidi Klum

We get a lot of posts and questions about multithreading and asynchronous method call patterns. This is an area that is difficult to grasp for many, especially for newer developers who come from the Classic VB space. In classic VB you could actually perform asynchronous operations, but it wasn't fun at all -- it's a lot more fun and much easier with .NET!

The example I'll show is a Console App that scans a number of web sites to see whether each site is up and, if so, whether the last-modified date and/or content length have changed. If you have a couple of hundred links to check, doing this synchronously, one by one, could take a long time, because each request sits idle waiting on the network before the next one starts. Using the .NET ThreadPool with asynchronous calls lets those waits overlap, which can speed up the process dramatically. That's one of the key determinants of whether an operation should be done asynchronously: if it would benefit from parallelism, it's a good candidate. I'll show only the key components of the operation; extra code such as storing or comparing items from a data store is up to you to implement.



There are seven important points that need to be understood to cover this issue completely:

  1. In a real-world app, we need to scan our list on a separate thread so that the UI remains responsive.
  2. We use HttpWebRequest.BeginGetResponse() to initiate an asynchronous request.
  3. We need to use ThreadPool.RegisterWaitForSingleObject() to register a timeout delegate for unresponsive Web requests. Asynchronous WebRequest calls ignore the WebRequest class's Timeout property.
  4. We will use a WebProxy object to allow usage from behind a corporate firewall.
  5. We will read the entire Response stream and store it in a string.
  6. We will check the last-modified header. This is not always present, but you can compare the content-length to a previous known value as a workaround to see if the page has changed.
  7. We will use a custom State object to store our key data and objects for processing in the callback.

 

The Code:

using System;

namespace ScanSites
{
    using System.Collections;
    using System.IO;
    using System.Net;
    using System.Threading;

    class Class1
    {
        [STAThread]
        static void Main(string[] args)
        {
            ArrayList alSites = new ArrayList();
            alSites.Add("http://www.blah.com"); // (yes apparently it's a real site)
            alSites.Add("http://msn.com");
            alSites.Add("http://asp.net");
            alSites.Add("http://microsoft.com");
            alSites.Add("http://www.hello.com"); // (yup, that's a site too)
            ScanSites(alSites);
            // keep the process alive while the callbacks run on pool threads
            Console.ReadLine();
        }

        private static void ScanSites(ArrayList sites)
        {
            // add the proxy if necessary (dictated by contents of appSettings)
            string proxyAddressAndPort = System.Configuration.ConfigurationSettings.AppSettings["proxyAddressAndPort"];
            string proxyUserName = System.Configuration.ConfigurationSettings.AppSettings["proxyUserName"];
            string proxyPassword = System.Configuration.ConfigurationSettings.AppSettings["proxyPassword"];
            // AppSettings returns null for a missing key, so test for null as well as empty
            if (proxyAddressAndPort != null && proxyAddressAndPort.Length > 0)
            {
                ICredentials cred = new NetworkCredential(proxyUserName, proxyPassword);
                WebProxy p = new WebProxy(proxyAddressAndPort, true, null, cred);
                GlobalProxySelection.Select = p;
            }

            foreach (string uriString in sites)
            {
                WebRequest request = WebRequest.Create(uriString);
                request.Method = "GET";

                object data = new object(); // container for our "Stuff"
                // RequestState is a custom class to pass info to the callback
                RequestState state = new RequestState(request, data, uriString);
                IAsyncResult result = request.BeginGetResponse(
                    new AsyncCallback(UpdateItem), state);

                // Register the timeout callback
                ThreadPool.RegisterWaitForSingleObject(
                    result.AsyncWaitHandle,
                    new WaitOrTimerCallback(ScanTimeoutCallback),
                    state,
                    (30 * 1000), // 30 second timeout
                    true         // executeOnlyOnce
                );
            }
        }

        private static void UpdateItem(IAsyncResult result)
        {
            // grab the custom state object
            RequestState state = (RequestState)result.AsyncState;
            WebRequest request = state.Request;
            // get the Response
            HttpWebResponse response =
                (HttpWebResponse)request.EndGetResponse(result);
            // read the header values we need before the response is closed
            long contentLength = response.ContentLength;
            string lastMod = String.Empty;
            if (response.Headers["last-modified"] != null)
                lastMod = response.Headers["last-modified"];

            Stream s = response.GetResponseStream();
            StreamReader readStream = new StreamReader(s);
            // dataString will hold the entire contents of the requested page if we need it.
            string dataString = readStream.ReadToEnd();
            // closing the reader also closes the stream it wraps; then close the response
            readStream.Close();
            response.Close();

            Console.WriteLine("Read: " + state.SiteUrl + ": " + contentLength.ToString() + " bytes. Last-Mod: " + lastMod);
        }

        private static void ScanTimeoutCallback(
            object state, bool timedOut)
        {
            // timedOut is false when the request completed within the interval
            if (timedOut)
            {
                RequestState reqState = (RequestState)state;
                if (reqState != null)
                    reqState.Request.Abort();
                Console.WriteLine("aborted- timeout");
            }
        }
    }

    class RequestState
    {
        public WebRequest Request; // holds the request
        public object Data; // store any data in this
        public string SiteUrl; // holds the UrlString to match up results (Database lookup, etc).

        public RequestState(WebRequest request, object data, string siteUrl)
        {
            this.Request = request;
            this.Data = data;
            this.SiteUrl = siteUrl;
        }
    }
}

Analysis:

I use an ArrayList at the beginning to store some test URLs. In a production app or class library, you would probably have a method that gets this data out of a database table, along with information such as the last check date, last content length, etc. We pass this information to the ScanSites method.
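The comparison against stored values is left to you, but a minimal sketch may help show the idea. The SiteSnapshot and ChangeDetector names below are hypothetical and not part of the downloadable solution; the sketch assumes you saved the raw last-modified header (or an empty string when it was absent) and the content length from the previous scan.

```csharp
using System;

// Hypothetical record of what one scan saw for a site (not in the original sample).
class SiteSnapshot
{
    public string LastModified;   // raw last-modified header, or "" when absent
    public long ContentLength;    // HttpWebResponse.ContentLength; -1 when not sent

    public SiteSnapshot(string lastModified, long contentLength)
    {
        LastModified = lastModified;
        ContentLength = contentLength;
    }
}

static class ChangeDetector
{
    // Returns true when the current scan looks different from the previous one.
    public static bool HasChanged(SiteSnapshot previous, SiteSnapshot current)
    {
        // Prefer the last-modified header when both scans saw it.
        if (previous.LastModified.Length > 0 && current.LastModified.Length > 0)
            return previous.LastModified != current.LastModified;

        // Otherwise fall back to comparing content lengths.
        return previous.ContentLength != current.ContentLength;
    }
}
```

In UpdateItem you could build the current snapshot from lastMod and response.ContentLength and pass it to HasChanged together with the stored one. Note that when neither scan has a usable header or length, this sketch reports "no change", which may or may not be what you want.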

We grab the appSettings elements in the config file for proxyAddressAndPort, proxyUserName, and proxyPassword. These are used to construct a WebProxy object, which we assign to GlobalProxySelection.Select so that it applies to all of our requests. Use this if you are behind a corporate firewall.
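For reference, the matching section of the application's config file might look something like the fragment below. The three key names come from the code above; the values are placeholders I made up, so substitute your own proxy details.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <appSettings>
    <!-- Placeholder values. Leave proxyAddressAndPort empty to skip proxy setup. -->
    <add key="proxyAddressAndPort" value="http://proxy.example.com:8080" />
    <add key="proxyUserName" value="someUser" />
    <add key="proxyPassword" value="somePassword" />
  </appSettings>
</configuration>
```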

We do a foreach, iterating over the list of sites to check, creating a new WebRequest for each.

We create a new custom State object to hold our key data. This is passed as the state parameter of the call to the asynchronous BeginGetResponse method:

IAsyncResult result = request.BeginGetResponse(new AsyncCallback(UpdateItem),state);

 

We send the result object's AsyncWaitHandle into our ThreadPool's RegisterWaitForSingleObject method. The RegisterWaitForSingleObject method we use checks the current state of the specified object's WaitHandle. If the object's state is unsignaled, the method registers a wait operation. The wait operation is performed by a thread from the thread pool. The delegate is executed by a worker thread when the object's state becomes signaled or the time-out interval elapses. If the timeOutInterval parameter is not zero (0) and the executeOnlyOnce parameter is false, the timer is reset every time the event is signaled or the time-out interval elapses. Here, this parameter is true. This is how we can overcome the issue with the WebRequest class's Timeout property being ignored in an asynchronous operation, and abort the request gracefully, freeing up the thread from the pool.
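To see the signaled-versus-timed-out behavior in isolation, here is a small sketch that needs no network at all. It waits on a ManualResetEvent instead of a request's AsyncWaitHandle; the WaitDemo class and WaitTimedOut method are names I made up for illustration, and the lambda syntax assumes a newer compiler than the one the sample above targets.

```csharp
using System;
using System.Threading;

static class WaitDemo
{
    // Registers a one-shot wait on the handle and reports how the callback fired:
    // true if the time-out elapsed first, false if the handle was signaled.
    public static bool WaitTimedOut(WaitHandle handle, int timeoutMs)
    {
        bool result = false;
        using (ManualResetEvent done = new ManualResetEvent(false))
        {
            RegisteredWaitHandle registration = ThreadPool.RegisterWaitForSingleObject(
                handle,
                (state, timedOut) =>
                {
                    result = timedOut;   // same flag ScanTimeoutCallback receives
                    done.Set();          // tell the caller the callback has run
                },
                null,        // no state object needed for this demo
                timeoutMs,
                true);       // executeOnlyOnce

            done.WaitOne();                 // block until the callback has fired
            registration.Unregister(null);  // release the registration
            return result;
        }
    }

    static void Main()
    {
        ManualResetEvent neverSignaled = new ManualResetEvent(false);
        Console.WriteLine(WaitTimedOut(neverSignaled, 100));   // True: the wait timed out

        ManualResetEvent alreadySignaled = new ManualResetEvent(true);
        Console.WriteLine(WaitTimedOut(alreadySignaled, 100)); // False: the handle was signaled
    }
}
```

The sketch also keeps the RegisteredWaitHandle that RegisterWaitForSingleObject returns and calls Unregister on it once the callback has run. The scanner above discards that return value, which keeps the listing short; holding on to it is one way to tidy up registrations in a longer-running app.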

 

Finally, as callbacks are "called back", we process them in our UpdateItem method: we extract the RequestState from the result's AsyncState property, call EndGetResponse on its request to obtain the WebResponse, and get whatever we need. A StreamReader is wrapped around the ResponseStream, and we read out the data into a string, which can be stored (or even "massaged" and deposited on the filesystem).

 

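Then, we Close our objects so that the Garbage Man will be happy and have something to pick up. Clean up your room! If you have used anything that has a Close or Dispose method, you need to call it when you're done. For a WebResponse this matters in practice: until it is closed, its underlying connection is not released for other requests to use.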

 

Note that I have not included exception handling code in this sample. That's a deliberate action in order to keep the code simpler to understand. However, when you write an application, you should ALWAYS have good, well-thought-out exception handling code anywhere that an exception could possibly occur.
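As one possible starting point, the sketch below shows a callback shaped like UpdateItem with EndGetResponse guarded by a try/catch. This matters for the timeout path in particular: once the timeout callback calls Abort, the request's callback still runs and EndGetResponse throws a WebException whose Status is RequestCanceled. The SafeCallback class and the Describe helper are hypothetical names, and to keep the block self-contained it assumes the WebRequest itself was passed as the state object rather than a RequestState.

```csharp
using System;
using System.IO;
using System.Net;

static class SafeCallback
{
    // Hypothetical helper: turns a WebException into a short description.
    public static string Describe(WebException ex)
    {
        // Abort() on a pending request surfaces as RequestCanceled.
        if (ex.Status == WebExceptionStatus.RequestCanceled)
            return "aborted (timeout or cancellation)";

        // Protocol errors (404, 500, ...) still carry a response we can inspect.
        HttpWebResponse http = ex.Response as HttpWebResponse;
        if (http != null)
            return "HTTP " + (int)http.StatusCode;

        return "failed: " + ex.Status;
    }

    // Same shape as UpdateItem, but failures are reported instead of thrown.
    // Assumes the WebRequest was passed as the state argument of BeginGetResponse.
    public static void UpdateItemSafe(IAsyncResult result)
    {
        WebRequest request = (WebRequest)result.AsyncState;
        try
        {
            // 'using' closes the reader and the response even if reading fails.
            using (WebResponse response = request.EndGetResponse(result))
            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                string body = reader.ReadToEnd();
                Console.WriteLine("Read " + request.RequestUri + ": " + body.Length + " chars");
            }
        }
        catch (WebException ex)
        {
            Console.WriteLine(request.RequestUri + " " + Describe(ex));
        }
    }
}
```

This only catches WebException; a real application would likely also want to handle I/O errors while reading the stream and decide what to log or retry.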

 

I hope this exercise in Asynchronous calling patterns is useful to you. In closing, I want to note that this is one of many, many ways to do multithreaded operations. In future articles, I'll cover some new ones, including the use of a custom threadpool.

 

Download the Visual Studio.NET 2003 Solution that accompanies this article

 

Peter Bromberg is a C# MVP, MCP, and .NET consultant who has worked in the banking and financial industry for 20 years. He has architected and developed web-based corporate distributed application solutions since 1995, and focuses exclusively on the .NET Platform.