Refreshing ASP.NET Cached Data on a timed basis to avoid delays

Shows how to refresh a slow datasource on a timer's background thread and keep the result in the Cache, so that page requests never have to wait while the refresh is running.

I've written about ASP.NET Caching and its various forms in several different articles here on eggheadcafe.com. However, there is one situation that I have not covered: the case where fetching your datasource takes a long time. If your cached item expires and has a callback to reload the data, but the reload is slow, you run the risk of not having a full copy of your data to serve from the Cache while you are waiting for the new data to come back.

As Steve Smith put it, "...the callback doesn't fire or complete execution prior to the cached item being removed from the cache. Thus, a user will frequently make a request that will try to access the cached value, find that it is null, and be forced to wait for it to repopulate. In a future version of ASP.NET, I would like to see an additional callback, which might be called CachedItemExpiredButNotRemovedCallback, which if defined must complete execution before the cached item is removed."

The approach shown here solves this problem by continuing to use the Cache to store our data, but fetching the data on a regular interval that is shorter than the cache timeout. The fetch runs on a Timer's background callback thread, so a fresh copy replaces the cached item before the old one can expire, and page requests should not be held up waiting for the data.

To do this, I set up a System.Threading.Timer in Global, and start it in the Application_Start event handler. In this case I am simply using a Google Blog search on "SEO" for 100 items, with the results coming back as an XML feed, as a surrogate for some data-getting process that takes a long time. Actually, this one doesn't take long at all, but I'd rather use a real operation that illustrates a real-world programming usage than have a sample app with a bunch of fake "Thread.Sleep" calls just to simulate something.

First, let's take a look at the Global.asax code:

using System;
using System.Data;
using System.Threading;
using System.Web;
using System.Web.Caching;

public class Global : HttpApplication
{
    // Static so the single timer lives for the life of the application.
    private static Timer timer;
    private int interval = 60000*5; // five minutes, in milliseconds

    private static string dataUrl =
        "http://blogsearch.google.com/blogsearch_feeds?as_q=&num=100&hl=en&"
       + "ctz=240&c2coff=1&btnG=Search+Blogs&as_epq=SEO&as_oq=&as_eq=&bl_pt=&bl_bt=&bl_url=&"
       + "bl_auth=&as_drrb=q&as_qdr=d";

    protected void Application_Start(object sender, EventArgs e)
    {
        if (timer == null)
        {
            // The HttpContext is handed to the callback as the timer's state object.
            timer = new Timer(new TimerCallback(ScheduledWorkCallback), HttpContext.Current, interval, interval);
            // Load the data once right away so the Cache is filled before the first tick.
            ScheduledWorkCallback(HttpContext.Current);
        }
    }

    // Called by the timer on a background thread; "sender" is the state object passed above.
    public static void ScheduledWorkCallback(object sender)
    {
        HttpContext context = (HttpContext) sender;
        DataSet ds = new DataSet();
        ds.ReadXml(dataUrl);
        DataTable InfoTable = new DataTable();
        InfoTable.Columns.Add("title");
        InfoTable.Columns.Add("link");
        InfoTable.Columns.Add("description");
        InfoTable.Columns.Add("pubDate");
        DataRow row = null;

        // Assemble one row per entry from the separate tables ReadXml produced.
        for (int i = 0; i < ds.Tables[1].Rows.Count - 1; i++)
        {
            row = InfoTable.NewRow();
            row["title"] = ds.Tables[1].Rows[i]["title_text"];
            row["link"] = ds.Tables[2].Rows[i]["href"];
            row["description"] = ds.Tables[5].Rows[i]["content_text"];
            row["pubDate"] = ds.Tables[4].Rows[i]["published"];
            InfoTable.Rows.Add(row);
        }
        // cache it for six minutes...
        context.Cache.Insert("InfoData", InfoTable, null, DateTime.Now.AddMinutes(6), Cache.NoSlidingExpiration);
    }
}

I start out with my Timer at the top of the class, and declare the interval. Then in Application_Start, I check for null and create my timer, passing in the HttpContext, the interval, and a ScheduledWorkCallback method that will handle my callback processing. I also call ScheduledWorkCallback once directly, so that the Cache is populated immediately instead of after the first five-minute tick.

The ScheduledWorkCallback method takes my URL string for the Google Blog search and loads the results directly into a DataSet using the ReadXml method. Since the current HttpContext is being passed in, I have easy access to Cache in the callback.
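
One thing the callback does not address is what happens when a refresh throws an exception or runs longer than the timer interval. Timer callbacks run on thread pool threads, and from .NET 2.0 onward an unhandled exception there will normally bring down the process, so it may be worth wrapping the work. The following is only a minimal sketch: RefreshGuard, RefreshWork and TryRun are hypothetical names of my own and are not part of the sample solution.

```csharp
using System;
using System.Threading;

// Hypothetical delegate for a unit of refresh work (not part of the sample solution).
public delegate void RefreshWork();

// Hypothetical helper that keeps timer callbacks from overlapping or throwing.
public static class RefreshGuard
{
    private static int running; // 0 = idle, 1 = a refresh is in progress

    // Runs the work unless an earlier run is still in progress.
    // Returns true if the work was attempted, false if it was skipped.
    public static bool TryRun(RefreshWork work)
    {
        // Atomically flip 0 -> 1; any other value means someone else is running.
        if (Interlocked.CompareExchange(ref running, 1, 0) != 0)
        {
            return false;
        }
        try
        {
            work();
        }
        catch (Exception)
        {
            // Swallowed on purpose in this sketch: the previously cached copy
            // stays in place until its own expiration. Real code would log this.
        }
        finally
        {
            Interlocked.Exchange(ref running, 0);
        }
        return true;
    }
}
```

Under these assumptions, the body of ScheduledWorkCallback could be passed to RefreshGuard.TryRun as an anonymous delegate. A tick that arrives while the previous refresh is still running is then simply skipped, and a failed download leaves the old cached table in place until its six-minute expiration.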

You'll notice that Google's feed comes back with Atom namespaces, and when the ReadXml method is used, you get six DataTables in your DataSet. Each of them holds part of the data, which needs to be "assembled" by row index in order to produce a more standard DataTable containing Title, Link, Description and PubDate.

Finally, after my custom "InfoTable" is assembled, I stick it in Cache with a six-minute expiration. My timer interval is only five minutes, so each refresh replaces the item about a minute before it would expire, and the Cache item should never expire under normal circumstances.

In the ASP.NET Page in the sample solution, I simply have a GridView to display my data from the DataTable that I have stored in the Cache. The page refreshes every two seconds as a "sanity check" for the operation.
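
The page code is not reproduced in this article, so what follows is only a sketch of what such a code-behind might look like. The class name DefaultPage, the control name GridView1 and the use of a Refresh header are my assumptions here; the actual sample solution may do it differently (for example with a meta refresh tag in the markup).

```csharp
using System;
using System.Data;
using System.Web.UI;
using System.Web.UI.WebControls;

// Hypothetical code-behind; class and control names are assumptions.
public class DefaultPage : Page
{
    // Assumed to match a GridView declared in the .aspx markup.
    protected GridView GridView1;

    protected void Page_Load(object sender, EventArgs e)
    {
        // Ask the browser to reload every two seconds (one possible way to do it).
        Response.AppendHeader("Refresh", "2");

        // Read the table that the timer callback stored under "InfoData".
        DataTable info = Cache["InfoData"] as DataTable;
        if (info != null)
        {
            GridView1.DataSource = info;
            GridView1.DataBind();
        }
        // If info is null the grid is simply left empty; with the timer
        // running this should only happen if a refresh has failed.
    }
}
```

Note that the page never loads the data itself. It only reads from the Cache, which is the whole point of moving the slow work onto the timer thread.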

The advantage of this approach is that if you have a data-gathering operation that generally takes a long while, you will always be getting a fresh copy of your data immediately out of the Cache, even though a lengthy update operation may be going on during this time on the timer's background thread. The net result is that we always have reasonably fresh data, and no delays for the page to display it because of a Cache item being invalidated or expired. Sure, it would be nice if the Cache class featured an "about to blow" event, but as far as I know, it doesn't.
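
For readers on newer versions of the framework: later releases (starting, I believe, with .NET 3.5 SP1) added a Cache.Insert overload that takes a CacheItemUpdateCallback, which is called before the item is removed and lets you hand back a replacement value. That is close to the "about to blow" event wished for above. The sketch below is not part of the sample solution; InfoCache, Prime, OnInfoExpiring and LoadInfoTable are hypothetical names, and LoadInfoTable is just a stand-in for the real slow data-gathering step.

```csharp
using System;
using System.Data;
using System.Web;
using System.Web.Caching;

// Hypothetical helper, not part of the sample solution.
public static class InfoCache
{
    // Stand-in for the slow data-gathering step (an assumption for this sketch).
    public static DataTable LoadInfoTable()
    {
        DataTable table = new DataTable("InfoData");
        table.Columns.Add("title");
        return table;
    }

    // Inserts the item together with an update callback.
    public static void Prime()
    {
        HttpRuntime.Cache.Insert(
            "InfoData", LoadInfoTable(), null,
            DateTime.Now.AddMinutes(6), Cache.NoSlidingExpiration,
            new CacheItemUpdateCallback(OnInfoExpiring));
    }

    // Called by the Cache before the item is removed. Whatever is assigned
    // to expensiveObject replaces the old value; null lets the item go.
    public static void OnInfoExpiring(
        string key, CacheItemUpdateReason reason,
        out object expensiveObject, out CacheDependency dependency,
        out DateTime absoluteExpiration, out TimeSpan slidingExpiration)
    {
        expensiveObject = LoadInfoTable();
        dependency = null;
        absoluteExpiration = DateTime.Now.AddMinutes(6);
        slidingExpiration = Cache.NoSlidingExpiration;
    }
}
```

If this overload is available to you, it may remove the need for a separate timer. The timer approach shown in this article still works on any version of the framework.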

Incidentally, if you are using this from behind a firewall, you can try adding the following code in your Application_Start:

WebRequest.DefaultWebProxy = WebRequest.GetSystemWebProxy();



You can download the VS.NET 2005 Web Application Project solution here.
By Peter Bromberg