ASP.NET Executing Multiple HttpRequests In a WCF Service via ThreadPool and Task Factory

I saw a forum post recently where a user had a WebService method that made GET requests for a number of urls before returning something like a string array of the contents. He was complaining that with a lot of urls, it was taking a very long time, and asked if there was a way to "multithread" this operation.

Although I've written on this subject before, it occurred to me that since we now have .NET Framework 4.0, the Task Parallel Library gives us newer, more efficient ways to "skin a cat", so I've put together this demo that has two separate services - one that does it via the ThreadPool (the "old" way) using a ManualResetEvent to block until completion, and a second that uses the TaskFactory class from .NET 4.0.

The client is simply a web page with a label and two buttons. One button calls the Threadpool Service method, and the other button calls into the TaskFactory service with its method. The results are displayed in the label control. To keep this usable, I've only used four urls - but you could do 100 if you wanted to.
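For reference, the code-behind for such a page might look roughly like this. This is only a sketch: the WCF proxy class names (Service1Client, Service2Client), the control IDs (btnThreadPool, btnTasks, lblResults), and the placeholder urls are all assumptions, not taken from the downloadable solution.

using System;
using System.Web.UI;

// Rough sketch of the test page's code-behind. Proxy names, control IDs
// and urls are assumed for illustration only.
public partial class Default : Page
{
    protected void btnThreadPool_Click(object sender, EventArgs e)
    {
        var urls = new[]
        {
            "http://www.example.com/",   // placeholder urls - use any four you like
            "http://www.example.org/",
            "http://www.example.net/",
            "http://www.iana.org/"
        };
        var client = new Service1Client();                  // generated WCF proxy (name assumed)
        lblResults.Text = String.Join("<hr/>", client.GetRequests(urls));
    }

    protected void btnTasks_Click(object sender, EventArgs e)
    {
        var urls = new[] { "http://www.example.com/", "http://www.example.org/" };
        var client = new Service2Client();                  // generated WCF proxy (name assumed)
        lblResults.Text = String.Join("<hr/>", client.GetRequestsWithTasks(urls));
    }
}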

Let's have a look at the first service (ThreadPool):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Runtime.Serialization;
using System.ServiceModel;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

namespace MultiRequest
{
    public class StateThing
    {
        public string Url { get; set; }
        public int Counter { get; set; }

        public StateThing(string url, int counter)
        {
            this.Url = url;
            this.Counter = counter;
        }
    }
    
    [ServiceContract]
    public class Service1
    {
        private ManualResetEvent mre = new ManualResetEvent(false);
        private string[] _contents;
        private int counter = 0;

        [OperationContract]
        public string[] GetRequests(string[] urls)
        {
            _contents = new string[urls.Length];

            for (int i = 0; i < urls.Length; i++)
            {
                var state = new StateThing(urls[i], i);
                ThreadPool.QueueUserWorkItem(DoWork, state);
            }
            // block this thread until the last url has been processed
            mre.WaitOne();
            return _contents;
        }

        private void DoWork(object state)
        {
            var st = (StateThing)state;
            using (var wc = new WebClient())
            {
                // store the result at the index carried in the state object,
                // so results stay lined up with their urls
                _contents[st.Counter] = wc.DownloadString(st.Url);
            }
            // thread-safe count of completed downloads; signal when the last one is done
            if (Interlocked.Increment(ref counter) == _contents.Length)
                mre.Set(); // OK! Let 'er go!
        }
    }
}

You can see above that I've created a small class, StateThing, to hold state. It can contain anything you want; it is passed as the second parameter to the ThreadPool.QueueUserWorkItem method. So our service method receives a string array of urls, iterates over it, creates an instance of the state object for each url, and queues each work item onto the thread pool.

Now we need to prevent the GetRequests method from returning until all the urls have been retrieved. One easy way to do this is with the ManualResetEvent class. When you call WaitOne on the ManualResetEvent, the calling thread is blocked until something calls the ManualResetEvent's Set method. All we need is a thread-safe counter of completed downloads, as can be seen in the DoWork method: the worker that finishes last calls Set.
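In isolation, the wait-and-signal pattern looks something like this (a minimal stand-alone sketch, separate from the service code):

using System;
using System.Threading;

// Minimal stand-alone sketch of the WaitOne/Set pattern.
class WaitSignalDemo
{
    static ManualResetEvent done = new ManualResetEvent(false); // starts unsignaled
    static int remaining = 3;

    static void Main()
    {
        for (int i = 0; i < 3; i++)
        {
            ThreadPool.QueueUserWorkItem(_ =>
            {
                Thread.Sleep(500); // simulate some work
                // the worker that finishes last signals the waiting thread
                if (Interlocked.Decrement(ref remaining) == 0)
                    done.Set();
            });
        }
        done.WaitOne(); // blocks here until Set() is called
        Console.WriteLine("All workers finished.");
    }
}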

The ThreadPool has a set number of threads by default (this can be adjusted with SetMaxThreads), and if all of them are in use it queues requests until one becomes available. This is an efficient way to process multiple operations in parallel. Note that when you create a Task or Task&lt;TResult&gt; object to perform work asynchronously, by default that task is also scheduled to run on a thread pool thread.
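If you ever do need to inspect or tune the pool's limits, the calls look like this (the values passed to SetMaxThreads here are arbitrary examples, not recommendations):

using System;
using System.Threading;

// Quick look at the thread pool's limits.
class PoolLimits
{
    static void Main()
    {
        int workers, ioThreads;
        ThreadPool.GetMaxThreads(out workers, out ioThreads);
        Console.WriteLine("Max worker threads: {0}, max I/O threads: {1}", workers, ioThreads);

        // SetMaxThreads returns false if the request can't be honored
        bool changed = ThreadPool.SetMaxThreads(100, 100);
        Console.WriteLine("Limits changed: {0}", changed);
    }
}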

Now let's have a look at the TaskFactory alternative:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Runtime.Serialization;
using System.ServiceModel;
using System.Text;
using System.Threading.Tasks;

namespace MultiRequest
{
  
    [ServiceContract]
    public class Service2
    {
        private string[] _contents;

        [OperationContract]
        public string[] GetRequestsWithTasks(string[] urls)
        {
            var tasks = new Task[urls.Length];
            _contents = new string[urls.Length];
            for (int i = 0; i < urls.Length; i++)
            {
                // each iteration gets its own state object, so the lambda
                // closure captures the right url and index
                var state = new StateThing(urls[i], i);
                var task = Task.Factory.StartNew(() => DoTask(state), TaskCreationOptions.LongRunning);
                tasks[i] = task;
            }

            Task.WaitAll(tasks);
            return _contents;
        }

        private void DoTask(object state)
        {
            var st = (StateThing)state;
            using (var wc = new WebClient())
            {
                // store the result at the index carried in the state object
                _contents[st.Counter] = wc.DownloadString(st.Url);
            }
        }
    }
}

The approach is similar to the first, but instead we call
var task = Task.Factory.StartNew(() => DoTask(state), TaskCreationOptions.LongRunning);
and add each task to an array of Task objects. We then call Task.WaitAll(tasks), which serves the same purpose as the ManualResetEvent in the first example: it blocks until every task has completed.
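As an aside (this variant is not part of the sample solution), you could also let each task carry its result back via Task&lt;TResult&gt; and skip the shared _contents array entirely. A minimal sketch, assuming the same using directives as Service2 above:

        // Alternative sketch: each task returns its downloaded string,
        // so no shared array or state object is needed.
        public string[] GetRequestsWithTaskResults(string[] urls)
        {
            var tasks = new Task<string>[urls.Length];
            for (int i = 0; i < urls.Length; i++)
            {
                string url = urls[i];   // capture a copy of the loop variable for the closure
                tasks[i] = Task.Factory.StartNew(() =>
                {
                    using (var wc = new WebClient())
                    {
                        return wc.DownloadString(url);
                    }
                }, TaskCreationOptions.LongRunning);
            }
            Task.WaitAll(tasks);
            // after WaitAll, reading Result no longer blocks
            return tasks.Select(t => t.Result).ToArray();
        }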

You can download the complete Visual Studio 2010 Solution here.

Remember: anytime you need to do a lot of "something" at the same time, Parallel is your friend.
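For completeness - and again, this variant isn't in the sample solution - the Parallel class from the same Task Parallel Library can handle both the fan-out and the blocking for you:

        // Another sketch: Parallel.For does not return until every
        // iteration has completed, so no explicit signaling is needed.
        public string[] GetRequestsParallel(string[] urls)
        {
            var contents = new string[urls.Length];
            Parallel.For(0, urls.Length, i =>
            {
                using (var wc = new WebClient())
                {
                    contents[i] = wc.DownloadString(urls[i]);
                }
            });
            return contents;
        }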

By Peter Bromberg