ASP.NET: Using the Yahoo Term Extraction API to generate keyword phrases for a web page

When writing an article or a blog post, we often want useful keywords for search engines to index. These can be generated directly from the content itself using the Yahoo Term Extraction API.

The API is easy to use: you POST the content as a form variable named "context", along with your API key as "appid", and you get back an XML document listing the terms that Yahoo recommends. The class below converts that response into a list of strings.

Learn about the Term Extraction API:
Get an API Key here

Here is a class I put together that handles everything:

using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.Configuration;
using System.IO;
using System.Linq;
using System.Net;
using System.Xml;
using System.Xml.Linq;

namespace GetTags
{
    /// <summary>
    /// Generates tags (keyword phrases) from content, using the Yahoo Term Extraction API.
    /// </summary>
    public class TagGenerator
    {
        /// <summary>
        /// The URL of the Yahoo Term Extraction API endpoint.
        /// </summary>
        const string ADDRESS = "http://api.search.yahoo.com/ContentAnalysisService/V1/termExtraction";

        /// <summary>
        /// The Yahoo application ID, read from the "API_KEY" appSettings entry in web.config.
        /// </summary>
        readonly string APP_ID = ConfigurationManager.AppSettings["API_KEY"];

        /// <summary>
        /// Gets the tags for the specified content.
        /// </summary>
        /// <param name="content">The content.</param>
        /// <returns>The terms returned by the API, in document order.</returns>
        public IList<string> GetTags(string content)
        {
            if (String.IsNullOrEmpty(APP_ID)) throw new InvalidOperationException("API KEY REQUIRED.");

            // Build the form fields for the POST request
            NameValueCollection nvc = new NameValueCollection();
            nvc.Add("appid", APP_ID);
            nvc.Add("context", content);

            // POST the fields; the using block disposes the WebClient even if the call throws
            byte[] b;
            using (WebClient wc = new WebClient())
            {
                b = wc.UploadValues(ADDRESS, nvc);
            }

            // Parse the XML response and collect the terms
            using (MemoryStream ms = new MemoryStream(b))
            using (XmlReader xmlReader = XmlReader.Create(ms))
            {
                XDocument xDoc = XDocument.Load(xmlReader);
                return GetTermsFromXml(xDoc.Root).ToList();
            }
        }

        // Yields the text of every element beneath the root (one element per term)
        private IEnumerable<string> GetTermsFromXml(XElement root)
        {
            foreach (var x in root.Descendants())
                yield return x.Value;
        }
    }
}
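
To make it clearer what GetTermsFromXml is working with, here is a minimal sketch that parses a hand-written sample response. The sample XML is my assumption about the response shape (a ResultSet root holding one Result element per term, in a default namespace), not captured API output. ResponseShapeDemo and ExtractTerms are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ResponseShapeDemo
{
    // Assumed sample response: element names and namespace are illustrative,
    // not copied from a real API call.
    public const string SampleXml =
        "<ResultSet xmlns=\"urn:yahoo:cate\">" +
        "<Result>italian sculptors</Result>" +
        "<Result>the virgin mary</Result>" +
        "</ResultSet>";

    // Returns the text of every element beneath the root, in document order.
    public static List<string> ExtractTerms(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        // Descendants() with no name argument matches elements in any namespace,
        // so the default xmlns on the root does not need special handling.
        return doc.Root.Descendants().Select(e => e.Value).ToList();
    }
}
```

Calling ResponseShapeDemo.ExtractTerms(ResponseShapeDemo.SampleXml) yields the two sample terms. Because the unnamed Descendants() call returns every nested element, this approach relies on the response being flat; if the real response nested other elements under the root, filtering by element name would be the safer choice.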

I then supply a web page with two multiline textboxes and a button. In the top textbox, you paste the contents of the subject web page. When you press the button, the bottom textbox is populated with the list of returned terms. These can be used in the META Keywords or Description tags for the page or for other purposes.
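
If you want to drop the returned terms into a META Keywords tag, something like the following sketch may help. It is not part of the downloadable solution: KeywordFormatter, ToMetaKeywords and the maxTerms parameter are hypothetical names, and the choice to trim, de-duplicate case-insensitively and cap the number of terms is my own assumption about what makes a tidy keyword list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeywordFormatter
{
    // Hypothetical helper: joins terms into a comma-separated string
    // suitable for the content attribute of a META keywords tag.
    public static string ToMetaKeywords(IEnumerable<string> terms, int maxTerms)
    {
        return string.Join(", ", terms
            .Where(t => !string.IsNullOrWhiteSpace(t))      // drop empty entries
            .Select(t => t.Trim())                          // remove stray whitespace
            .Distinct(StringComparer.OrdinalIgnoreCase)     // keep the first of any case-insensitive duplicates
            .Take(maxTerms));                               // cap the list length
    }
}
```

For example, passing the list returned by GetTags with a maxTerms of 10 gives a single string such as "italian sculptors, the virgin mary". When writing the result into the page, HTML-encode it first (for instance with HttpUtility.HtmlAttributeEncode), since the terms come from arbitrary pasted content.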

You can download the Visual Studio 2010 solution here. Be sure to get yourself an API key first and enter it in web.config.
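
For reference, the TagGenerator class reads the key from an appSettings entry named "API_KEY", so the relevant part of web.config would look roughly like this. The value shown is a placeholder, and the rest of the file in the downloadable solution may differ.

```xml
<configuration>
  <appSettings>
    <!-- Replace the placeholder with your own Yahoo application ID -->
    <add key="API_KEY" value="YOUR_APP_ID_HERE" />
  </appSettings>
</configuration>
```

If the entry is missing or empty, GetTags throws an InvalidOperationException before making any request.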

By Peter Bromberg