Building A Comprehensive Sitemap for SEO

An XML sitemap file is essential for helping search engines find the pages on your site. Maintaining a sitemap can be almost as challenging as keeping the site itself up to date. In this article we describe the sitemap index file and how to lay out your site logically so that it is more search engine friendly.

We are constantly being asked to develop more and more complex sites: sites that include static pages, blogs, shopping carts, and more. Much of today's software can create an XML sitemap that lists the individual pages it manages. WordPress, for example, can be loaded with a plug-in that generates an XML sitemap. If the software is installed in a subdirectory, say /blog/, then the sitemap will be generated in that directory. Likewise, a site with a Zen Cart installation in the /store/ directory can have another sitemap in that directory for its products. Using a sitemap index file, you can let each piece of software generate its own sitemap and then pull them all together into a single file for the search engines to find.

The first thing to do is list all the software that generates a sitemap and where each sitemap resides. Continuing the previous examples, let's say we also have some static pages. We then have a sitemap at /blog/sitemap.xml, a sitemap at /store/sitemap.xml, and a root sitemap at /sitemap.xml. The next step is to rename /sitemap.xml to something else, say /sitemap_static.xml.

Now create a new text file called /sitemap.xml. This file is also XML, like the sitemaps themselves, but instead of listing pages it lists each of the other sitemaps. The first line in the file is the XML declaration, which indicates that the file is XML and gives its encoding:

<?xml version="1.0" encoding="UTF-8"?>

The next line states that it is a sitemap index and the particular schema it adheres to:

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">

Now add a sitemap entry for each sub-sitemap that you have, giving each one a location and a last-modified date:

<sitemap><loc>http://www.example.net/sitemap_static.xml</loc><lastmod>2009-11-12</lastmod></sitemap>
<sitemap><loc>http://www.example.net/blog/sitemap.xml</loc><lastmod>2009-11-10</lastmod></sitemap>
<sitemap><loc>http://www.example.net/store/sitemap.xml</loc><lastmod>2009-11-15</lastmod></sitemap>

Don’t forget to close out your sitemap index tag:

</sitemapindex>
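
If you would rather generate the index file than type it by hand, the steps above can be sketched in a few lines of Python. This is only a minimal sketch using the standard library; the function name build_sitemap_index and the entries list are our own illustrative choices, and the URLs and dates are simply the ones from the example above.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap_index(entries):
    """Return sitemap index XML (as a str) for a list of (loc, lastmod) pairs."""
    # Register the sitemap namespace as the default so the output uses
    # a plain xmlns="..." attribute instead of a generated prefix.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for loc, lastmod in entries:
        sitemap = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    body = ET.tostring(root, encoding="unicode")
    # Prepend the XML declaration ourselves so it matches the article exactly.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Example data taken from the article; adjust to your own site.
entries = [
    ("http://www.example.net/sitemap_static.xml", "2009-11-12"),
    ("http://www.example.net/blog/sitemap.xml", "2009-11-10"),
    ("http://www.example.net/store/sitemap.xml", "2009-11-15"),
]
index_xml = build_sitemap_index(entries)
print(index_xml)
```

Writing index_xml out to /sitemap.xml in your web root is left to you, since the right path depends on your server layout.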

Now when you make changes to only a portion of your site, you can either submit just the sitemap of the portion that changed to the search engines, or submit the index file and let them determine what has changed. For example, if you write a new blog entry and your /blog/sitemap.xml file is updated, you can resubmit either /blog/sitemap.xml or /sitemap.xml. Don’t forget to update the lastmod date in your index file whenever a sub-sitemap file changes.
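
Keeping the lastmod dates current is easy to forget, so it may be worth scripting. The following is a rough sketch, not a finished tool: set_lastmod and file_date are hypothetical helper names of our own, and the sample index is a shortened version of the one built above.

```python
import os
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NS = {"sm": SITEMAP_NS}

def file_date(path):
    """Modification date of a local file as YYYY-MM-DD (UTC)."""
    stamp = os.path.getmtime(path)
    return datetime.fromtimestamp(stamp, tz=timezone.utc).strftime("%Y-%m-%d")

def set_lastmod(index_xml, loc, new_date):
    """Return the index XML with the <lastmod> of the entry matching loc replaced."""
    ET.register_namespace("", SITEMAP_NS)  # keep the default xmlns on output
    root = ET.fromstring(index_xml)
    for sitemap in root.findall("sm:sitemap", NS):
        if sitemap.findtext("sm:loc", namespaces=NS) == loc:
            lastmod = sitemap.find("sm:lastmod", NS)
            if lastmod is None:
                # Entry had no <lastmod> yet, so add one.
                lastmod = ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}lastmod")
            lastmod.text = new_date
    # Note: the returned string does not include the XML declaration line.
    return ET.tostring(root, encoding="unicode")

index_xml = (
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<sitemap><loc>http://www.example.net/blog/sitemap.xml</loc>"
    "<lastmod>2009-11-10</lastmod></sitemap>"
    "<sitemap><loc>http://www.example.net/store/sitemap.xml</loc>"
    "<lastmod>2009-11-15</lastmod></sitemap>"
    "</sitemapindex>"
)
# The new date here is made up for illustration.
updated = set_lastmod(index_xml, "http://www.example.net/blog/sitemap.xml", "2009-11-20")
print(updated)
```

In practice you would probably pass file_date("blog/sitemap.xml") (with whatever local path applies on your server) as the new date instead of a hard-coded string, so the index follows the sub-sitemap file automatically.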

Put a line in your robots.txt file that points to your sitemap index file. This lets search engine spiders find the files in your site. You may have submitted your site to a number of search engines, but one that you left out may still eventually find your site. When it does, one of the first things it will typically look at is the robots.txt file, where it will see your sitemap index. The format of the sitemap line in robots.txt is:

Sitemap: http://www.example.net/sitemap.xml

This tells the search engines where your index file is. Once they read it, they will see that it is an index and pull in all the sub-sitemaps, making your site much more search engine optimized (SEO)!
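
To check your own setup, you can roughly mimic what a spider does: read the Sitemap line out of robots.txt, then read the index to find the sub-sitemaps. The sketch below works on inline sample text rather than fetching anything over the network, and sub_sitemaps is a helper name we made up. It relies on RobotFileParser.site_maps(), which is available in Python 3.8 and later. How any particular search engine actually processes these files is up to that engine.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sub_sitemaps(index_xml):
    """Return the <loc> of every <sitemap> entry in a sitemap index."""
    root = ET.fromstring(index_xml)
    return [loc.text.strip() for loc in root.findall("sm:sitemap/sm:loc", NS)]

# Sample robots.txt content (inline, so nothing is downloaded).
robots_txt = """User-agent: *
Disallow:
Sitemap: http://www.example.net/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
found = parser.site_maps() or []  # site_maps() returns None when no Sitemap line exists

# Sample index content, as it would be served from the URL found above.
index_xml = (
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<sitemap><loc>http://www.example.net/sitemap_static.xml</loc></sitemap>"
    "<sitemap><loc>http://www.example.net/blog/sitemap.xml</loc></sitemap>"
    "<sitemap><loc>http://www.example.net/store/sitemap.xml</loc></sitemap>"
    "</sitemapindex>"
)
subs = sub_sitemaps(index_xml)
print(found)
print(subs)
```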

For more information on sitemaps, see http://www.sitemaps.org
