13th August 2008 - 6 minutes read time
I have already talked about converting a sitemap.xml file into a urllist.txt file, but what if you want to create an HTML sitemap? If you have a sitemap.xml file, you can use it as the list of pages to spider: fetch each URL, scrape the contents of the page, and populate the HTML file with that information.
The following code does this. For every page it looks for the title tag, the description meta tag and the first h2 tag on the page. These items are then used to construct a segment of HTML for that page.
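Before the listing itself, here is a small standalone sketch in Python of the per-page step just described. It is not the code from this post, only an illustration of the idea. The names `sitemap_urls`, `PageSummaryParser` and `page_segment`, and the exact HTML layout of each segment, are assumptions made for this example. Fetching each page over HTTP is left out so that the sketch stays self-contained.

```python
from html import escape
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(sitemap_xml):
    """Return every <loc> URL found in a sitemap.xml document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


class PageSummaryParser(HTMLParser):
    """Collects the title, the description meta tag and the first h2."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.first_h2 = ""
        self._capture = None   # tag whose text is currently being collected
        self._h2_done = False  # set once the first h2 has been closed

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title" and not self.title:
            self._capture = "title"
        elif tag == "h2" and not self._h2_done:
            self._capture = "h2"
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            if not self.description:
                self.description = (attrs.get("content") or "").strip()

    def handle_data(self, data):
        if self._capture == "title":
            self.title += data
        elif self._capture == "h2":
            self.first_h2 += data

    def handle_endtag(self, tag):
        if tag == self._capture:
            if tag == "h2":
                self._h2_done = True
            self._capture = None


def page_segment(url, html):
    """Build one HTML segment for a page (hypothetical layout)."""
    parser = PageSummaryParser()
    parser.feed(html)
    parser.close()
    title = parser.title.strip() or url  # fall back to the URL if no title
    lines = ['<div class="page">',
             '  <h3><a href="%s">%s</a></h3>' % (escape(url), escape(title))]
    if parser.first_h2.strip():
        lines.append("  <h4>%s</h4>" % escape(parser.first_h2.strip()))
    if parser.description:
        lines.append("  <p>%s</p>" % escape(parser.description))
    lines.append("</div>")
    return "\n".join(lines)
```

To use it, you would call `sitemap_urls()` on the contents of sitemap.xml, download each URL (for example with `urllib.request.urlopen`), and join the results of `page_segment()` into a single HTML file.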