Saturday, February 11, 2006

Google sitemap.xml utility

Summer 2007
The major search engines are breathing more life into the sitemap.xml file scheme.
Google, Live, Yahoo, and Ask have all announced that they use sitemap.xml files to better understand websites. It's easy to tell the search engines about your sitemap.xml file: simply add a Sitemap line to your robots.txt file specifying its location. You can also go to the search engines themselves and notify them of the file directly.
An example robots.txt file follows:
* * * * * *
# Company Name or the like
#
# website name / URL
#
# *DO NOT MOVE OR ALTER FILE*
Sitemap: http://mysite.com/sitemap.xml
User-agent: *
Disallow: /css/
Disallow: /images/
* * * * * *
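
To double-check that a robots.txt like the one above really exposes the sitemap location, here is a minimal sketch using Python's standard urllib.robotparser module (site_maps() needs Python 3.8 or later). The file contents are pasted in as a string rather than fetched, the page paths being tested are made up for illustration, and the images rule is written as a plain /images/ prefix because this parser does simple prefix matching and does not expand a trailing wildcard.

```python
from urllib.robotparser import RobotFileParser

# The example robots.txt from this post, pasted in as a string.
# (The images rule is written as a plain prefix; see the note above.)
ROBOTS_TXT = """\
# Company Name or the like
Sitemap: http://mysite.com/sitemap.xml
User-agent: *
Disallow: /css/
Disallow: /images/
"""


def read_robots(text):
    """Parse robots.txt text and return the populated parser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = read_robots(ROBOTS_TXT)

# Sitemap locations declared in the file (None if there are none).
print(parser.site_maps())

# Hypothetical page paths, just to show which ones the rules block.
print(parser.can_fetch("*", "http://mysite.com/index.html"))
print(parser.can_fetch("*", "http://mysite.com/css/site.css"))
```

This only checks how one parser reads the file; each search engine's own crawler may interpret edge cases a little differently.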
--------

Google sitemap.xml utility

Google has a sitemap tool that is supposed to help get websites indexed (faster?), as opposed to just having a 'sitemap.html' in the main folder of your website.
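
For anyone who has not seen one, a sitemap.xml file is just an XML list of page URLs. The following is a rough sketch of building one by hand in Python, not the output of Google's own tool. It assumes the sitemaps.org 0.9 namespace that the major engines share, and the page URLs and the build_sitemap name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the shared sitemaps.org protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML text with one <url><loc> entry per page URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages; replace with the real pages of the site.
xml_text = build_sitemap([
    "http://mysite.com/",
    "http://mysite.com/about.html",
])
print(xml_text)

# To publish it, save the text as sitemap.xml in the site's main folder:
# with open("sitemap.xml", "w", encoding="utf-8") as f:
#     f.write(xml_text)
```

Only the loc element is required for each entry; optional fields such as lastmod are left out to keep the sketch short.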
I set up a Christmas site using that utility a few months ago.
Since then, I have been watching search engine results on Google, Yahoo, and MSN, and found that Yahoo and MSN picked up pages much faster than Google did, even with this so-called tool. My Halloween site was submitted through the Google sitemap utility, since it was new, in the hopes that it would get indexed quickly. The only thing Google seemed to find quickly was the main page, which took just a few days. It took a good 2 1/2 months for any other pages to show up, while Yahoo and MSN had already found those pages in about 3 1/2 weeks.
All the while, I can see that Google spiders from several different regions have been visiting the sites on a daily basis, sometimes viewing a single page and other times viewing several pages.
It seems to me that this sitemap tool by Google is nothing more than an excuse for their spiders to do less work. In any case, I and others have found that the tool ( google.com/webmasters/sitemaps/login ) is not very efficient.
Summer 2007: The webmaster tools have been improved a bit and seem to work better now.
I have resumed using sitemap.html files for getting my sites indexed by Google.