Consuming OPML with LINQ and Syndication Framework

It's time to be "movin' on up" with Framework 3.5 and LINQ to do old things in a brand new way! Consume the New York Times's OPML list of featured feeds into an XDocument, pull out the feed URLs with a LINQ query, then mash up only the latest items and publish the result with the Syndication Framework.

"Movin' on Up! To a dee-luxe apartment in the sky, We're movin' on up!" -- Jeffersons Theme Song

Well, clap your hands! With the .NET 3.0 and 3.5 Frameworks, we have reason to celebrate, for we truly are "movin' on up" to the new Framework in the sky! I sincerely hope you've begun your studies of LINQ and some of the features of the 3.5 Framework, as I have.

Here we are going to do something that's relatively simple, but in a brand new way, and it will prove to be even simpler. I'm going to consume the New York Times OPML of their featured feeds into an XDocument (LINQ-friendly) and run a LINQ query over it to get the URL of each featured feed. Then, using the Syndication Framework, I'll load each feed into a SyndicationFeed object and use a second LINQ query to get just the most recent entries. Finally, I'll create a brand new "mashed up" feed of just the newest entries from each featured feed and publish it out to the Response.OutputStream. Sound like fun? It is! Here's the code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.ServiceModel.Syndication;
using System.Xml;
using System.Xml.Linq;

namespace FeedMashup
{
    public partial class _Default : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
          
            if (Cache["mashedFeed"] == null)
            {
                // Get the NYT's OPML of their featured feeds...
                XDocument doc = XDocument.Load("http://www.nytimes.com/services/xml/rss/nyt/index.opml");

                // Grab outline elements at any level, but only the ones that have a
                // non-empty xmlUrl attribute, using a LINQ query over the XDocument:
                IEnumerable<string> query =
                    from attrib in doc.Descendants("outline").Attributes("xmlUrl")
                    where !string.IsNullOrEmpty(attrib.Value)
                    select attrib.Value;
            
                // Create a List of type SyndicationFeed
                List<SyndicationFeed> feeds = new List<SyndicationFeed>();
                int ctr = 0;

                // Add the first 20 valid feeds
                foreach (string s in query)
                {
                    try
                    {
                        // XmlReader.Create can itself fail for an unreachable URL,
                        // so it sits inside the try along with the feed parsing
                        using (XmlReader feedReader = XmlReader.Create(s))
                        {
                            feeds.Add(SyndicationFeed.Load(feedReader));
                            ctr++;
                            System.Diagnostics.Debug.WriteLine("Got Feed: " + ctr.ToString());
                        }
                    }
                    catch (Exception ex)
                    {
                        // uh-oh, unreachable URL or bad format, let's skip him
                        System.Diagnostics.Debug.WriteLine("Skipped " + s + ": " + ex.Message);
                    }
                    if (ctr > 19) break;
                }
                // Make sure the feed items we display are all < 12 hours old
                DateTime minDate = DateTime.Now.AddHours(-12);

                // And a little LINQ:
                var mashedItems =
                    from feed in feeds
                    from item in feed.Items
                    where item.PublishDate > minDate
                    orderby item.PublishDate descending
                    select item;

                int maxResults = 20;
                // Take only the first 20 items, give the feed a title etc.
                // ToList() runs the query once, so the cached feed holds a fixed set of items
                SyndicationFeed mashedFeed = new SyndicationFeed("New York Times Hot Items", "Custom NYT Feed", null, mashedItems.Take(maxResults).ToList());
                // Store in Cache for when the page is re-requested (could use an expiration here)
                Cache["mashedFeed"] = mashedFeed;
            }
            // Write our new custom mashed-up feed out from the cached SyndicationFeed
            SyndicationFeed mashedFeedOut = (SyndicationFeed)Cache["mashedFeed"];
            WriteFeed(mashedFeedOut);
        }
      
        private void WriteFeed(SyndicationFeed mashedFeed)
        {
            // Set the content type before any output is written
            Response.ContentType = "text/xml";
            using (XmlWriter writer = XmlWriter.Create(Response.OutputStream))
            {
                // boy this method sure is nice...
                mashedFeed.SaveAsRss20(writer);
            }
        }
    }
}
OK, so what are we doing here?

1) We're loading the NYT published OPML of their featured feeds into an XDocument.
2) Then we create a simple LINQ query over the XDocument to get the "xmlUrls":
      

IEnumerable<string> query =
    from attrib in doc.Descendants("outline").Attributes("xmlUrl")
    where !string.IsNullOrEmpty(attrib.Value)
    select attrib.Value;


3) Now that we have our list of feed URLs in "query", we create a List of type SyndicationFeed:

List<SyndicationFeed> feeds = new List<SyndicationFeed>();

and we iterate over our URL strings, loading each one through an XmlReader and adding the resulting feed to the list (feeds that fail to load are skipped, and we stop after 20):

feeds.Add(SyndicationFeed.Load(feedReader));

4) Finally, we keep only the items that are less than 12 hours old, sort them newest first, take the top 20 into a new SyndicationFeed, and write the "mashed-up" custom RSS to the output stream:

SyndicationFeed mashedFeedOut = (SyndicationFeed)Cache["mashedFeed"];
WriteFeed(mashedFeedOut);
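
The query from step 2 can be tried without a network call. The following is a minimal sketch, not the article's download: it assumes a small hand-written OPML string (the example.com URLs, the OpmlDemo class, and the GetFeedUrls helper are all hypothetical) and runs a query of the same shape over it. The where clause here skips empty values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class OpmlDemo
{
    // Hypothetical sample OPML; the example.com URLs are placeholders.
    public const string SampleOpml = @"<opml version=""1.0"">
  <head><title>Sample</title></head>
  <body>
    <outline text=""News"">
      <outline text=""World"" xmlUrl=""http://example.com/world.xml"" />
      <outline text=""Tech"" xmlUrl=""http://example.com/tech.xml"" />
    </outline>
    <outline text=""Empty"" xmlUrl="""" />
  </body>
</opml>";

    // Returns every non-empty xmlUrl attribute found on outline elements at any depth.
    public static List<string> GetFeedUrls(string opml)
    {
        XDocument doc = XDocument.Parse(opml);
        IEnumerable<string> query =
            from attrib in doc.Descendants("outline").Attributes("xmlUrl")
            where !string.IsNullOrEmpty(attrib.Value)
            select attrib.Value;
        return query.ToList();
    }

    public static void Main()
    {
        foreach (string url in GetFeedUrls(SampleOpml))
        {
            Console.WriteLine(url);
        }
    }
}
```

The "News" folder outline has no xmlUrl attribute, so Attributes("xmlUrl") never yields anything for it, and the "Empty" outline is dropped by the where clause. Only the two nested feed URLs come back.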

Mind you, this is a lot less code than the "old way"! Of course, I'm caching the feed so that when the page is re-requested, we can serve the feed from the cached item. One improvement to this arrangement would be to download the feeds asynchronously and in parallel, which would speed up the process. But I'll save that for a future article.
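
To get a feel for how little code the merge itself takes, here is a rough sketch of the step 4 query run against two in-memory feeds instead of downloaded ones. The feed and item titles, the MashupDemo class, and the Merge and SampleFeeds helpers are hypothetical names for illustration. In .NET 3.5 the System.ServiceModel.Syndication types live in System.ServiceModel.Web.dll, so a console project needs that reference.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.ServiceModel.Syndication;

public static class MashupDemo
{
    // Flattens all feeds into one item list, keeps items newer than minDate,
    // sorts newest first, and returns at most maxResults items.
    public static List<SyndicationItem> Merge(
        IEnumerable<SyndicationFeed> feeds, DateTimeOffset minDate, int maxResults)
    {
        var mashedItems =
            from feed in feeds
            from item in feed.Items
            where item.PublishDate > minDate
            orderby item.PublishDate descending
            select item;
        return mashedItems.Take(maxResults).ToList();
    }

    private static SyndicationItem MakeItem(string title, DateTimeOffset published)
    {
        return new SyndicationItem
        {
            Title = new TextSyndicationContent(title),
            PublishDate = published
        };
    }

    // Two made-up feeds, each with one recent item and one old item.
    public static List<SyndicationFeed> SampleFeeds(DateTimeOffset now)
    {
        var feedA = new SyndicationFeed("Feed A", "placeholder", null, new[]
        {
            MakeItem("A fresh", now.AddHours(-1)),
            MakeItem("A stale", now.AddHours(-30))
        });
        var feedB = new SyndicationFeed("Feed B", "placeholder", null, new[]
        {
            MakeItem("B stale", now.AddHours(-20)),
            MakeItem("B fresh", now.AddHours(-3))
        });
        return new List<SyndicationFeed> { feedA, feedB };
    }

    public static void Main()
    {
        DateTimeOffset now = DateTimeOffset.Now;
        List<SyndicationItem> items = Merge(SampleFeeds(now), now.AddHours(-12), 20);
        foreach (SyndicationItem item in items)
        {
            Console.WriteLine(item.PublishDate + "  " + item.Title.Text);
        }
    }
}
```

With the sample data, only "A fresh" and "B fresh" pass the 12-hour cutoff, and they come back in that order because the query sorts by PublishDate descending. Returning a List rather than the lazy query means the result is computed once, which matters if the feed built from it is going to sit in a cache.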

You can download the Visual Studio .NET ASP.NET Web Application here.

By Peter Bromberg