Google Sitemaps: Close, But Not Quite There

For those unfamiliar with the technology, Google Sitemaps is a new service offered by Google. It lets webmasters submit their web pages using the Google-created Sitemap Protocol, which uses Extensible Markup Language (XML) to list a site's pages in a Google-customized format. These sitemaps are meant for Google specifically and, unlike a conventional HTML sitemap, are not part of the site's own link structure.
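To make the XML format less abstract, here is a minimal sketch in Python of how a list of page URLs could be turned into a sitemap document. The function name `build_sitemap` and the example URLs are hypothetical, and the namespace shown is the one published at sitemaps.org, which is an assumption here; the Google-specific schema described in this article may use a different identifier.

```python
import xml.etree.ElementTree as ET

# Assumed namespace (sitemaps.org 0.9); the article does not name one.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML document, as a string, listing the given URLs."""
    # The root element holds one <url> entry per submitted page.
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # <loc> carries the page address; ElementTree escapes characters
        # such as "&" automatically when serializing.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        "http://www.example.com/",
        "http://www.example.com/about.html",
    ]))
```

A real sitemap file would normally also begin with an XML declaration and be saved as UTF-8; those details are left out to keep the sketch short.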

Webmasters have the opportunity to submit as many or as few pages as they wish from their websites and, assuming there is nothing that would be considered an unethical manipulation, all of the pages listed within a Sitemap will presumably get indexed.

Problems with the Sitemap Protocol Itself

  1. Because webmasters can submit as many URLs as they wish, they can also submit as many “auto-generated”, scraped, and similar pages as they wish. In theory, Googlebot should eventually crawl every one of these URLs to decide whether or not to index it. Because the Sitemap does not need to be hyperlinked from the site itself, submitting such pages becomes significantly easier.
  2. Because the Sitemap Protocol is XML-based, the vast majority of webmasters will be unfamiliar with how to write the required sitemap code and may find it intimidating. Although XML closely resembles HTML/XHTML (in fact, XHTML is an application of XML), there is still a learning curve, and the format is specific to Google rather than an established standard.
  3. Separate sitemaps must be created for the www and non-www versions of a domain name. For webmasters whose URLs use both, this can cause difficulty, although the issue is reasonably easy to fix.
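As an illustration of the third point, one possible way to handle a site that uses both www and non-www URLs is to group the URLs by hostname first and then build one sitemap per group. The sketch below is only one approach under that assumption; the function name `group_urls_by_host` and the example URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_urls_by_host(urls):
    """Group URLs by hostname so each host can get its own sitemap."""
    groups = defaultdict(list)
    for url in urls:
        # Hostnames are case-insensitive, so normalize before grouping.
        host = urlsplit(url).netloc.lower()
        groups[host].append(url)
    return dict(groups)


if __name__ == "__main__":
    mixed = [
        "http://www.example.com/a.html",
        "http://example.com/b.html",
        "http://WWW.example.com/c.html",
    ]
    for host, host_urls in group_urls_by_host(mixed).items():
        print(host, len(host_urls))
```

Each resulting group could then be written out as a separate sitemap file for its hostname.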

Because of these issues, the Sitemap Protocol will prove to be an ineffective long-term solution. However, some aspects of the Google Sitemaps service, specifically site ownership verification, could be used to let webmasters indicate what should and should not be indexed.
