Google is the best search engine game in town right now. Its indexing software, Googlebot, visits billions of web pages, tracking and indexing their content and making it available to you when you search. The software is smart and imbued with some extra special artificial intelligence that's proprietary. But, like every robot, it can still use a little help from its friends. And that's you.
When creating a web site, remember that the Googly Spider is software, which means it has a set of capabilities, limitations, and algorithms it uses to index content. There are many highly efficient ways to foil the spider and make it impossible for it to index your content well. Or you can help the Googlebot index your site well, and then people will find it when searching for words it contains. And if you have advertising there that pays you money, that means, very simply, a bigger check each month.
As a web site developer, there are a few simple things you can do to help the Googlebot understand your web site and index it happily, bringing you much joy and bliss:
- Keep the number of links on a given page to fewer than 100. This is from Google's Webmaster Guidelines.
- Use the TITLE tag: give every page on the site a complete and meaningful <title>. Google also offers the "allintitle:" operator, which lets users search only the text that appears in a page title. There are over 12 million results returned for "Untitled Document"! Are your pages in this category?
- Avoid frames. Don't use frames at all! Did we say don't use frames? Frames on a web site are bad for lots of reasons. They prevent the user from bookmarking individual documents. They present related information in separate documents, which keeps search engines from associating that information. They require the browser to make multiple requests per page, increasing client-server connections and eating up server CPU cycles, network bandwidth and users' time. If that's not enough, we could give you more. Frames, well, frames suck. They have become passé.
- Use URLs with query strings sparingly. When you do use dynamic URLs, keep in mind that the shorter the list of query string parameters, the better.
- Make sure that the title and alt attributes exist and are complete and meaningful in each page's markup. For example, the markup for that picture of Dr. Dotnetsky should be something like
<img src="/images/doc.jpg" alt="Mr. know it all, Dexter Dotnetsky" />
- Make all relevant information on a page textual. Don't embed page content into images or objects like Flash movies. The Googly Spider can't index that stuff.
- Ensure that your web server supports the If-Modified-Since HTTP header. This feature allows your web server to tell Google whether your content has changed since the Googlebot last crawled your site: if nothing has changed, the server answers with a short 304 Not Modified response instead of sending the whole page again. Supporting this feature saves you bandwidth and overhead. This comes, again, from Google's Webmaster Guidelines.
- Use robots.txt and meta robots tags to guide the Googlebot. These standards-based directives allow you to specify important things like whether or not Google will cache your page content and/or images, and whether or not the Googlebot will index content on pages that maybe you don't want available to the searching public.
If you are a blogger, use the meta tags to help the spiders index only your permalinks, not your constantly changing front page. You can do this with:
<meta name="robots" content="noindex,follow" >
on your front page and
<meta name="robots" content="index,follow" >
on your posts' permanent locations.
- Use meaningful text inside your <a> tags so the Googlebot can associate that text with the link's href. For example, if I am going to link to my pictures from the Democratic Convention, I should make the descriptive phrase the link, as in "Take a look at my photos from the Democratic Convention," instead of hanging the link on the word "here" in "My Democratic Convention pictures are here." Although Google doesn't explicitly recommend this, it can help.
Don't use link text like "read more," "go here," "download it," or "click here."
- Include a <meta name="description" content="[insert your site's description here]"> tag in your page's <head> to summarize your site; even better, include descriptive text on the site's front page where users can actually read it, like, "eggheadcafe.com is a feeble attempt to explain the obvious to programmers by two nerds who have nothing better to do than answer inane forum posts." This is the text Google can show as the description for the site in its results.
- Forget <meta name="keywords"> ever existed. Really. They're about as slick as dried snot on a doorknob, and they can't even open an umbrella, much less have any value to the search engines or bring you traffic. Really. Forget 'em.
- Place more important content higher in the markup than less important content in a page.
- Don't try to fool the Googly Spider with hidden links or duplicate content or irrelevant pages of words like "sex" and "hot girls." The Googly Spider does not like being fooled. The Googly Spider will remember, and it will make you sorry that you even thought of doing that.
- In addition, optimize your site to be as small a download as possible. Remember that when you reference script with the <script src="/scripts/thisscript.js"></script> tag arrangement, the browser can cache the script file and reuse it across pages. When the script is inline inside the page, it is downloaded again with every page request. Avoid large amounts of ViewState in ASP.NET pages. Many nice 3D and shadow text effects can be obtained with IE filter behaviors (Internet Explorer only) rather than actual images.
- Finally, if you run context-sensitive advertising such as Google AdSense ads on your site, remember what you are getting paid for. The advertisers are specifically paying you for VISITS TO THEIR SITE when somebody clicks on one of your ad links. So, first, you should be willing to accept the fact that when somebody clicks one of these, they are immediately going off your site in the same browser window. There is no way around this. Further, if this financial arrangement is acceptable to you, you can do a lot of optimization on exactly what content you are showing, where the ads are placed and how they are chosen: the size, location on the page, background and coloring, and so on. This is an entirely separate subject that we may address in another article. But the main point is, you must provide quality content on each page for your visitors to consume first. The ads will take care of themselves. Don't try to cheat or use tricks like the items mentioned above to "get clicks". It just won't work.
Also, Robbe Morris has a nice article about our experiences with AdSense and placement, as well as the script we concocted to track hits.
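Several items on the checklist above (link count, titles, alt text, meta description, frames) are easy to check mechanically before the Googlebot ever shows up. What follows is only a rough sketch in Python using the standard library's html.parser; the names PageAudit and audit, the sample markup, and the exact wording of the warnings are all our own inventions for illustration, not anything Google publishes.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Collects the few facts about a page that the checklist cares about."""

    def __init__(self):
        super().__init__()
        self.link_count = 0
        self.title = ""
        self.images_missing_alt = 0
        self.has_description = False
        self.has_frames = False
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # attribute values may be None, hence the "or" below
        if tag == "a" and attrs.get("href"):
            self.link_count += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "img" and not (attrs.get("alt") or "").strip():
            self.images_missing_alt += 1
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.has_description = bool((attrs.get("content") or "").strip())
        elif tag in ("frame", "frameset"):
            self.has_frames = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html, max_links=100):
    """Return a list of human-readable warnings for one page of markup."""
    page = PageAudit()
    page.feed(html)
    page.close()
    warnings = []
    title = page.title.strip()
    if not title or title.lower() == "untitled document":
        warnings.append("missing or placeholder <title>")
    if page.link_count >= max_links:
        warnings.append(f"{page.link_count} links (keep it under {max_links})")
    if page.images_missing_alt:
        warnings.append(f"{page.images_missing_alt} image(s) without alt text")
    if not page.has_description:
        warnings.append("no meta description")
    if page.has_frames:
        warnings.append("uses frames")
    return warnings


# Hypothetical sample page, made up for this example.
sample = """<html><head><title>Untitled Document</title></head>
<body><img src="/images/doc.jpg"><a href="/photos">my photos</a></body></html>"""
print(audit(sample))
```

Run against the sample, the sketch flags the placeholder title, the image without alt text, and the missing meta description. It only looks at markup, so it cannot tell you whether a title or description is actually meaningful; that part is still up to you.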
Resources for this article came from various sources, including Google's published guidelines, Gina Trapani's scribbling.net, webmasterworld.com, and our own discoveries.
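For the If-Modified-Since tip in the list above, most web servers already handle static files correctly; dynamically generated pages are where you may have to do it yourself. Here is a minimal sketch of the decision in Python, assuming you know when the content last changed. The function name conditional_response and the dates used are hypothetical, and a real handler would still need to send the body for the 200 case.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_response(if_modified_since, last_modified):
    """Return (status, headers) for a GET, honoring If-Modified-Since.

    if_modified_since: raw header value sent by the client, or None.
    last_modified: timezone-aware datetime of the content's last change.
    """
    # HTTP dates are in GMT with one-second resolution.
    last_modified = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall through to a full response
        if since is not None and since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        if since is not None and last_modified <= since:
            return 304, headers  # nothing new: no body needs to be sent
    return 200, headers


# Hypothetical example: the crawler last fetched the page at the same moment
# the content was last changed, so nothing new needs to be sent.
page_changed = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
status, headers = conditional_response("Mon, 01 Jan 2024 12:00:00 GMT", page_changed)
print(status, headers["Last-Modified"])
```

A header that is missing or cannot be parsed simply gets the full 200 response, which is the safe default: the worst case is some wasted bandwidth, never stale content.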