&tbs=rltm:1 [real time results]
&tbs=qdr:s [past second]
&tbs=qdr:n [past minute]
&tbs=qdr:h [past hour]
&tbs=qdr:d [past 24 hours (day)]
&tbs=qdr:w [past week]
&tbs=qdr:m [past month]
&tbs=qdr:y [past year]
Currently the first one (rltm:1) does not work, but it used to, and Google will probably
resurrect it since they have already implemented real-time results in Google+
search. You can also replace qdr:n with qdr:n10 for the last ten minutes, qdr:n30
for the last thirty minutes, etc.
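As a rough illustration of how these parameters combine, the sketch below builds a search URL for an arbitrary look-back window in minutes. The `SearchUrlBuilder` class, its `Build` method, and the `https://www.google.com/search?q=` base address are assumptions made for this example; they are not taken from the downloadable project.

```csharp
using System;

public static class SearchUrlBuilder
{
    // Assumed base address; the original project's URL constants are not shown.
    private const string BaseUrl = "https://www.google.com/search?q=";

    // Builds a URL restricted to the last `minutes` minutes via tbs=qdr:nN.
    public static string Build(string searchTerm, int minutes)
    {
        if (string.IsNullOrWhiteSpace(searchTerm))
            throw new ArgumentException("A search term is required.", "searchTerm");
        if (minutes < 1)
            throw new ArgumentOutOfRangeException("minutes");

        // Escape the term so spaces, '&' and '#' cannot break the query string.
        string query = Uri.EscapeDataString(searchTerm);
        return BaseUrl + query + "&tbs=qdr:n" + minutes;
    }
}
```

For example, `SearchUrlBuilder.Build("asp.net mvc", 30)` yields a URL ending in `q=asp.net%20mvc&tbs=qdr:n30`.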
I built a search facility back in 2008 using HtmlAgilityPack to scrape these results
and, surprisingly, the original code still works perfectly. This is not something
you would want to deploy on a public website: when there are too many search
requests from a single IP address, Google will start throwing up CAPTCHA challenges
to make you prove you're not a "bot". However, it can be very
useful for research (finding out what's hot in a particular subject), or you
could execute a search, say, every 5 minutes and cache the result for 5 minutes.
But mostly I wrote it as an exercise in screen-scraping with HtmlAgilityPack, and so I'm sharing it here. What I wanted was a search facility that would
accept multiple keywords, execute a separate search on each, and aggregate and
return the results.
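The "search every 5 minutes and cache for 5 minutes" idea mentioned above could be sketched roughly as follows. `TimedCache` is a hypothetical helper written for this illustration, not part of the original solution, and it deliberately holds its lock while fetching so that only one request per key goes out at a time.

```csharp
using System;
using System.Collections.Generic;

// Caches one value per key for a fixed time-to-live (hypothetical helper).
public class TimedCache<TValue>
{
    private class Entry
    {
        public TValue Value;
        public DateTime Expires;
    }

    private readonly Dictionary<string, Entry> _entries = new Dictionary<string, Entry>();
    private readonly object _sync = new object();
    private readonly TimeSpan _ttl;
    private readonly Func<DateTime> _clock;

    // The clock is injectable so that expiry can be exercised without waiting.
    public TimedCache(TimeSpan ttl, Func<DateTime> clock = null)
    {
        if (clock == null) clock = () => DateTime.UtcNow;
        _ttl = ttl;
        _clock = clock;
    }

    // Returns the cached value for `key`, calling `fetch` only when the
    // entry is missing or has passed its time-to-live.
    public TValue GetOrAdd(string key, Func<TValue> fetch)
    {
        lock (_sync)
        {
            DateTime now = _clock();
            Entry entry;
            if (_entries.TryGetValue(key, out entry) && entry.Expires > now)
                return entry.Value;

            entry = new Entry { Value = fetch(), Expires = now + _ttl };
            _entries[key] = entry;
            return entry.Value;
        }
    }
}
```

A caller might keep one `TimedCache<List<SearchResult>>` with a five-minute time-to-live and pass the search call as the `fetch` delegate, so repeated requests for the same terms within that window never reach Google.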
The HTML of a typical "search result" in a Google page of search results
looks like this:
<li class=g><div class=vsc pved=0CEwQkgowAA sig=1dC><h3 class="r"><a href="http://forums.asp.net/p/1754776/4758868.aspx/1?Newby+Question+Parser+Error"
class=l onmousedown="return rwt(this,'','','','1','AFQjCNGW5y16_55IZr6JT_cfKjO8lC5RAQ','DNgfz1Glaz5mNPNi27l5Xw','0CEoQFjAA')">Newby
Question - Parser Error : The Official Microsoft <em>ASP</em>.<em>NET</em> Forums</a></h3><div class="s"><div class="f kv"><cite>forums.<b>asp</b>.<b>net</b>/p/1754776/4758868.<b>asp</b>x/1?Newby+Question...</cite><span class=vshid></span><button class="gbil esw eswd" onclick="window.gbar&&gbar.pw&&gbar.pw.clk(this)" onmouseover="window.gbar&&gbar.pw&&gbar.pw.hvr(this,google.time())"
g:entity="http://forums.asp.net/p/1754776/4758868.aspx/1?Newby+Question+Parser+Error"
g:undo="poS0" title="Recommend this page" g:pingback="/gen_204?atyp=i&ct=plusone&cad=S0"></button></div><div class="esc slp" id="poS0" style="display:none">You +1'd this publicly. <a href="#" class=fl>Undo</a></div><div class="f slp">1 post - 1 author - Last post: 10 minutes ago</div><span class=st>Microsoft · Feedback on <em>ASP</em>.<em>NET</em>|; File Bugs · <em>ASP</em>.<em>net</em>. Microsoft is conducting an online survey to understand your opinion of the <em>ASP</em>.<em>NET</em> Web site. <b>...</b><br></span></div>
If you are not familiar with HtmlAgilityPack, it is a .NET library (written in C#),
originally by Simon Mourier, that turns an HTML page into a DOM you can query with
XPath. So with a little XPath knowledge, you can scrape pretty much anything you want
out of a retrieved page of content.
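To make that concrete before looking at the real code, here is a minimal sketch of querying markup with HtmlAgilityPack. The `XPathDemo` class and the XPath expressions are assumptions chosen to match the sample markup shown earlier; Google's markup changes over time, so treat the selectors as illustrative only.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class XPathDemo
{
    // Returns (title, href) pairs for every <li class="g"> in the markup.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        List<KeyValuePair<string, string>> found = new List<KeyValuePair<string, string>>();

        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection items = doc.DocumentNode.SelectNodes("//li[@class='g']");
        if (items == null) return found;

        foreach (HtmlNode item in items)
        {
            // Relative XPath: look for the result link anywhere below this <li>.
            HtmlNode anchor = item.SelectSingleNode(".//h3[@class='r']/a");
            if (anchor == null) continue;

            // InnerText drops nested tags such as <em>; DeEntitize decodes &amp; etc.
            string title = HtmlEntity.DeEntitize(anchor.InnerText);
            string href = anchor.GetAttributeValue("href", string.Empty);
            found.Add(new KeyValuePair<string, string>(title, href));
        }
        return found;
    }
}
```

Searching by relative XPath rather than by child position makes the extraction less sensitive to wrapper elements being added around the link.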
Here is a nugget of code that illustrates how I do this:
// baseUrl1 and baseUrl2 are class-level URL strings that are not shown here.
public List<SearchResult> GetResults(string searchTerm, object state)
{
    StateObject stateObject = (StateObject)state;

    // Build the URL in a local variable. Assigning back to the shared baseUrl1
    // field would remove the "tbs=qdr:n10" placeholder for later calls and is
    // not safe when several searches run in parallel.
    string baseUrl = stateObject.Minutes == 120
        ? baseUrl2
        : baseUrl1.Replace("tbs=qdr:n10", "tbs=qdr:n" + stateObject.Minutes);

    searchTerm = searchTerm.Replace(".", "");
    // Escape the term so spaces and characters such as '&' or '#' survive in the query string.
    string fullUrl = baseUrl + Uri.EscapeDataString(searchTerm);

    List<SearchResult> results = new List<SearchResult>();

    string html;
    using (WebClient wc = new WebClient())
    {
        html = wc.DownloadString(fullUrl);
    }

    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(html);

    // SelectNodes returns null when nothing matches.
    HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//li[@class='g']");
    if (links == null || links.Count == 0) return results;

    foreach (HtmlNode node in links)
    {
        try
        {
            SearchResult sr = new SearchResult();
            // Assumes the first grandchild of each <li> is the result's <a> element.
            HtmlNode anchor = node.FirstChild.FirstChild;
            sr.Title = HtmlHelper.HtmlStripTags(anchor.InnerHtml, true, true);
            sr.Link = anchor.Attributes["href"].Value;
            string desc = node.SelectSingleNode("div").InnerText.Trim();
            sr.Description = HtmlHelper.HtmlStripTags(desc, true, true);
            results.Add(sr);
        }
        catch (Exception ex)
        {
            // Skip results whose markup does not match the expected layout.
            Debug.Write(ex.ToString());
        }
    }
    return results;
}

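private readonly ManualResetEvent mre = new ManualResetEvent(false);
private readonly object _sync = new object();
private int _numItems = 0;
private int _ctr = 0;
private readonly List<SearchResult> AllResults = new List<SearchResult>();

public List<SearchResult> MultiSearch(Dictionary<string, int> searchTerms, int minutes)
{
    // Reset state so that the instance can be reused for another search.
    _numItems = searchTerms.Count;
    _ctr = 0;
    mre.Reset();
    lock (_sync) { AllResults.Clear(); }
    if (_numItems == 0) return new List<SearchResult>();

    foreach (string srch in searchTerms.Keys)
    {
        string srchr = srch.Replace(".", "");
        ThreadPool.QueueUserWorkItem(SearchCallback, new StateObject(srchr, minutes, searchTerms[srch]));
    }

    // Wait for all searches to finish, but never longer than five seconds.
    mre.WaitOne(5000);

    // Return a snapshot: searches that timed out may still be adding results.
    lock (_sync) { return new List<SearchResult>(AllResults); }
}

private void SearchCallback(object state)
{
    StateObject stateObject = (StateObject)state;
    try
    {
        List<SearchResult> result = GetResults(stateObject.SearchTerm, stateObject);
        if (result != null)
        {
            // List<T> is not thread-safe, so guard concurrent writes.
            lock (_sync) { AllResults.AddRange(result); }
        }
    }
    catch (Exception ex)
    {
        // A failed download must not take down the thread-pool thread.
        Debug.Write(ex.ToString());
    }
    finally
    {
        // Atomic increment: several callbacks may finish at the same time.
        if (Interlocked.Increment(ref _ctr) == _numItems)
            mre.Set();
    }
}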
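
For comparison, the same fan-out-and-wait pattern used by MultiSearch can be sketched with the Task Parallel Library (available from .NET 4). This is an alternative under stated assumptions, not the code from the download: `FanOut.SearchAll` is a hypothetical name, and the `search` delegate stands in for a method such as GetResults, with plain strings in place of SearchResult to keep the sketch self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class FanOut
{
    // Runs `search` once per term in parallel and merges whatever finished
    // successfully within `timeout`.
    public static List<string> SearchAll(
        IEnumerable<string> terms, Func<string, List<string>> search, TimeSpan timeout)
    {
        Task<List<string>>[] tasks = terms
            .Select(term => Task.Factory.StartNew(() => search(term)))
            .ToArray();

        try
        {
            // Returns false on timeout; unfinished tasks are skipped below.
            Task.WaitAll(tasks, timeout);
        }
        catch (AggregateException)
        {
            // A failed search should not discard the results of the others.
        }

        List<string> merged = new List<string>();
        foreach (Task<List<string>> task in tasks)
        {
            if (task.Status == TaskStatus.RanToCompletion && task.Result != null)
                merged.AddRange(task.Result);
        }
        return merged;
    }
}
```

Because each task returns its own list and the merge happens on the calling thread, this version needs no shared counter or lock.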
What this does is:
1) Create the correct search URL based on the number of minutes the user has input, along
with one or more search terms.
2) Create a StateObject, which is a simple class to hold the search term and the
number of minutes back to search.
3) Download the search results page and load it into an HtmlAgilityPack HtmlDocument
object.
4) Execute a series of XPath queries designed to get the search result title, description
and link, and populate a SearchResult instance.
5) Use a regex-based helper class (HtmlHelper) to strip unwanted HTML tags out of the
title and description content.
6) Add the SearchResult to a List<SearchResult>.
7) Perform this action once per search term, each on a thread-pool thread. Use
a ManualResetEvent to make the calling code wait until all of them are done, or until
five seconds have passed.
8) Return the List<SearchResult> to the caller.
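The HtmlHelper class used in step 5 is not reproduced in this post. A minimal sketch of regex-based tag stripping might look like the following; `TagStripper` is a hypothetical stand-in, not the actual HtmlHelper, and a regex this simple will mishandle markup such as attribute values that contain a `>` character.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class TagStripper
{
    private static readonly Regex Tags = new Regex("<[^>]+>", RegexOptions.Compiled);
    private static readonly Regex Spaces = new Regex(@"\s+", RegexOptions.Compiled);

    // Removes tags, decodes HTML entities, and collapses runs of whitespace.
    public static string Strip(string html)
    {
        if (string.IsNullOrEmpty(html)) return string.Empty;

        string text = Tags.Replace(html, string.Empty);  // drop <em>, <b>, <br>, ...
        text = WebUtility.HtmlDecode(text);              // turn &amp; into &
        return Spaces.Replace(text, " ").Trim();
    }
}
```

For input such as `Newby Question - <em>ASP</em>.<em>NET</em> &amp; more`, this returns `Newby Question - ASP.NET & more`.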
Here are the StateObject and the SearchResult classes:
public class SearchResult
{
    public string Title { get; set; }
    public string Link { get; set; }
    public string Description { get; set; }
}

public class StateObject
{
    public string SearchTerm { get; set; }
    public int Minutes { get; set; }

    // Note: topicId is accepted but not stored by this class.
    public StateObject(string search, int minutes, int topicId)
    {
        this.SearchTerm = search;
        this.Minutes = minutes;
    }
}
I have a simple one-page web application that has a DropDownList for selecting
the time (minutes), a TextBox for entering one or more search terms, and a button
to kick off the above process. The List<SearchResult> that comes back is
used to bind a DataList.
You can download the complete Visual Studio 2010 solution here.