Parsing JSON to C# Classes Via Topsy Otter API and JSON.NET

With Twitter reducing its search history, applications that depend on search results reaching back months rather than days face a challenge. Several services have stepped in to fill this void by building search indexes on top of Twitter. Topsy is one of them, and it shares its results through an API called "Otter".

Topsy recently indexed its 5 billionth Tweet, so it is giving developers access to a lot of data. The Otter API is a REST-based interface to the Topsy Search Engine. Topsy, which claims to be the largest searchable index of content posted on Twitter, is driven by an architecture that spans a cluster of 500 servers and a petabyte of data. It ranks links, photos and tweets by the number and quality of the tweets.

The Otter API gives developers access to some fairly interesting data that Topsy has mined. You can find users who have mentioned a term, or look for experts on a particular topic. Topsy’s "author influence" is used to sort results, which I believe makes them more valid.

The full set of Otter API resources gives a developer some very interesting things to search for. For example, to find the experts on MongoDB, I can use the /experts resource of the Otter API. There’s no API key required, so you can dive right in. Here’s the MongoDB example:   http://otter.topsy.com/experts.json?query=mongodb  It's a nice way to find people to follow.

The Otter API uses a credit allocation system to ensure fair distribution of capacity. Each IP is allocated 10,000 credits per hour.

The typical API call deducts 1 credit from your allocation. Search-based API calls (/search, /searchcount, /authorsearch, /profilesearch) have a significantly higher computational cost on the backend, and deduct 10 credits per call.
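To make the credit arithmetic concrete, here is a minimal sketch. The class and method names (OtterCredits, CostOf, MaxCallsPerHour) are hypothetical and not part of the Otter API or of the accompanying solution; only the figures come from the paragraphs above.

```csharp
using System;

// Hypothetical helper: models the credit figures quoted above.
public static class OtterCredits
{
    public const int CreditsPerHour = 10000; // per-IP hourly allocation
    public const int SearchCallCost = 10;    // /search, /searchcount, /authorsearch, /profilesearch
    public const int DefaultCallCost = 1;    // every other resource

    // Returns the credit cost of a resource name such as "/search" or "experts".
    public static int CostOf(string resource)
    {
        switch (resource.TrimStart('/').ToLowerInvariant())
        {
            case "search":
            case "searchcount":
            case "authorsearch":
            case "profilesearch":
                return SearchCallCost;
            default:
                return DefaultCallCost;
        }
    }

    // Number of calls to one resource that fit into a single hour's allocation.
    public static int MaxCallsPerHour(string resource)
    {
        return CreditsPerHour / CostOf(resource);
    }
}
```

With these numbers, an application that only issues /search calls gets 1,000 of them per hour, while one that only calls /experts gets the full 10,000.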

No developer key or registration is required.  You can read the documentation here: http://code.google.com/p/otterapi/wiki/Resources

I needed some practice deserializing and working with C# types from JSON, so I figured that working out a usable C# library to hit the major Otter API methods would be useful, both now and in the future. I've included only what I consider the most important methods:

AuthorInfo - returns Topsy's Custom Author description with their "Author Influence" property.
Experts - provide a subject phrase like "mongodb norm" or "wcf" and get a list of experts, ranked by influence and frequency of posts.
LinkPosts - provide a Twitter username and get back a list of link Tweets by that author.
Related - List of related URLs. This list is derived by tracking other URLs that are mentioned in the same tweet as the query URL.
Search - List of results for a query.
Search (site) - List of results for a query using the site: modifier.
Search (user) - List of results for a query using the from: modifier.
Trackbacks - List of tweets that mention the query URL, most recent first. Also accepts a "contains", and an "influential only" modifier.
Trending - List of trending terms.

My approach is very simple: I have a class called "Otter" with a series of static methods named GetXXX (where XXX is the OtterApi operation name). Each method uses JSON.NET to parse only the part of the returned JSON string that I want to work with, via its JObject.Parse(string) method.

This is extremely useful compared with the JavaScriptSerializer or DataContractJsonSerializer classes, which do not provide any of these helper methods. So for example, if we get back a JSON string like the following:

{
  "request": {
    "parameters": {
      "window": "d",
      "q": "bernanke",
      "type": "cited"
    },
    "response_type": "json",
    "resource": "search",
    "url": "http://otter.topsy.com/search.json?q=bernanke&type=cited&window=d"
  },
  "response": {
    "window": "d",
    "page": 1,
    "total": 91,
    "perpage": 10,
    "last_offset": 10,
    "hidden": 0,
    "list": [
      {
        "trackback_permalink": "http://twitter.com/paceset9999/status/8590644708638721",
        "trackback_author_url": "http://twitter.com/paceset9999",
        "content": "RT @jetts424: Bernanke Rolling the Dice: America's Financial Dilemma     http://tinyurl.com/39px96f",
        "trackback_date": 1290883143,
        "topsy_author_img": "http://a3.twimg.com/profile_images/1129281907/vet01_normal.jpg",
        "hits": 10,
        "topsy_trackback_url": "http://topsy.com/www.marketoracle.co.uk/Article24599.html?utm_source=otter",
        "firstpost_date": 1290882191,
        "url": "http://www.marketoracle.co.uk/Article24599.html",
        "trackback_author_nick": "paceset9999",
        "highlight": "RT @jetts424: <span class=\"highlight-term\">Bernanke</span> Rolling the Dice: America's Financial Dilemma     http://tinyurl.com/39px96f ",
        "topsy_author_url": "http://topsy.com/twitter/paceset9999?utm_source=otter",
        "mytype": "link",
        "score": 12.836,
        "trackback_total": 8,
        "title": "Bernanke Rolling the Dice: America's Financial Dilemma :: The Market Oracle :: Financial Markets Analysis & Forecasting Free Website"
      },
      {
        "trackback_permalink": "http://twitter.com/dvolatility/status/8725998342242304",
        . . . .

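I would really only be interested in the "response" portion, and of that, probably only its "list" portion. To get at that, I could use the following code:

    JObject stuff = JObject.Parse(json);
    // SearchResult is a placeholder name for the class that models one entry of "list".
    var resp = JsonConvert.DeserializeObject<List<SearchResult>>(stuff["response"]["list"].ToString());

Because only the "list" array is bound to typed objects, this helps keep processing overhead down.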
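For a fuller picture, here is a self-contained sketch of the same idea. It assumes the Newtonsoft.Json (JSON.NET) package is referenced. SearchHit, OtterParsing and FromUnixSeconds are illustrative names, not the ones used in the accompanying solution, and SearchHit models only a handful of the fields shown in the sample reply. It also uses JToken.ToObject, which binds the token directly instead of converting it back to a string first.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json.Linq;

// Hypothetical class: models a few fields of one entry in "response.list".
// JSON properties with no matching member are ignored by JSON.NET's defaults.
public class SearchHit
{
    public string url { get; set; }
    public string title { get; set; }
    public int hits { get; set; }
    public double score { get; set; }
    public long trackback_date { get; set; } // Unix time, in seconds
}

public static class OtterParsing
{
    // Parses the reply, then binds only the "response.list" array to typed objects.
    public static List<SearchHit> ParseSearchList(string json)
    {
        JObject root = JObject.Parse(json);
        JToken list = root["response"]["list"];
        return list.ToObject<List<SearchHit>>();
    }

    // Converts a Unix timestamp such as trackback_date to a UTC DateTime.
    public static DateTime FromUnixSeconds(long seconds)
    {
        return new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc).AddSeconds(seconds);
    }
}
```

Called as OtterParsing.ParseSearchList(json), this returns one SearchHit per entry, and OtterParsing.FromUnixSeconds(hit.trackback_date) turns the numeric date into something displayable.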

Many of these API methods also accept a "page" parameter, so I've included that option as well. Each JSON object is modeled with a strongly typed C# class. For example:

using System.Collections.Generic;

namespace OtterApi
{
    // Paged wrapper: mirrors the "response" object of a trackbacks reply.
    public class Trackback
    {
        public int page { get; set; }
        public int total { get; set; }
        public int perpage { get; set; }
        public List<TrackbackList> list { get; set; }
    }

    // A single entry in the "list" array.
    public class TrackbackList
    {
        public string permalink_url { get; set; }
        public string date { get; set; }
        public string content { get; set; }
        public string type { get; set; }
        public AuthorInfo author { get; set; } // AuthorInfo class not shown here
        public string date_alpha { get; set; }
    }
}
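As a rough sketch of how classes like these can be filled from a reply, the generic helper below binds the whole "response" object to whatever type is requested. It assumes the Newtonsoft.Json package is referenced. OtterResponse.Parse is a hypothetical name, the model classes are condensed copies of the ones above so the example compiles on its own, and the AuthorInfo stand-in with its two fields is an assumption, since the real class is not shown in this article.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json.Linq;

// Stand-in for the real AuthorInfo class; these two fields are assumptions.
public class AuthorInfo
{
    public string name { get; set; }
    public string nick { get; set; }
}

// Condensed copies of the model classes shown above.
public class TrackbackList
{
    public string permalink_url { get; set; }
    public string content { get; set; }
    public AuthorInfo author { get; set; }
}

public class Trackback
{
    public int page { get; set; }
    public int total { get; set; }
    public int perpage { get; set; }
    public List<TrackbackList> list { get; set; }
}

public static class OtterResponse
{
    // Binds only the "response" object of a reply to T; "request" is skipped.
    public static T Parse<T>(string json)
    {
        JToken response = JObject.Parse(json)["response"];
        if (response == null)
            throw new FormatException("No \"response\" object in the reply.");
        return response.ToObject<T>();
    }
}
```

A call such as OtterResponse.Parse<Trackback>(json) then yields the paging fields (page, total, perpage) together with the typed list, which is convenient when the caller needs to know whether more pages remain.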

You can then take these various results and databind or display them. In my simplified WinForms "test harness" I provide a DataGridView to which the various results are databound.

NOTE: Where the documentation lists it, you can also append "&perpage=100" to the URL. I've tried values up to 100 but nothing higher.
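Putting the "page" and "perpage" options together, a request URL could be assembled along these lines. OtterUrls.Build is a hypothetical helper, not a method from the accompanying solution. It uses the "q" parameter seen in the sample search request (the /experts example uses "query" instead, so it would need a variant), and the cap of 100 simply mirrors the largest value tried in the note above rather than a documented limit.

```csharp
using System;

public static class OtterUrls
{
    // Base address taken from the example URLs in this article.
    private const string BaseUrl = "http://otter.topsy.com/";

    // Hypothetical helper: builds a search-style URL with paging options.
    public static string Build(string resource, string query, int page = 1, int perPage = 10)
    {
        if (page < 1)
            throw new ArgumentOutOfRangeException("page");
        if (perPage < 1 || perPage > 100) // 100 is the highest value tried, an assumed cap
            throw new ArgumentOutOfRangeException("perPage");

        // EscapeDataString percent-encodes spaces and reserved characters in the query.
        return BaseUrl + resource + ".json?q=" + Uri.EscapeDataString(query)
             + "&page=" + page + "&perpage=" + perPage;
    }
}
```

For instance, OtterUrls.Build("search", "bernanke", 2, 100) requests the second page of results at 100 per page.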

I hope the Otter API and the solution that accompanies this article are useful to you. I'm already coming up with a few interesting ideas for using it.  You can download the C# Visual Studio 2010 Solution here.

By Peter Bromberg