Google News Sitemap

Feb 6, 2010 at 11:17 AM
Edited Feb 6, 2010 at 11:29 AM

I've created a News Sitemap handler for a site I manage, as an extension to BlogEngine.

Google News requires a sitemap that follows the News Sitemap specification rather than the default Sitemap schema. See the following Google help page about this:

Google help page - "Should I use a specific format to create my News sitemap?"
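
For anyone who hasn't seen the format, a single News Sitemap entry looks roughly like the sketch below. The element names follow Google's published News Sitemap schema as I understand it. The post URL, publication name, language, date and title are made-up placeholder values, not output from my handler.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <!-- Placeholder post URL -->
    <loc>http://www.theblog.com/post/example-post.aspx</loc>
    <news:news>
      <!-- Publication details: placeholder values -->
      <news:publication>
        <news:name>The Blog</news:name>
        <news:language>en</news:language>
      </news:publication>
      <!-- Publication date in W3C date-time format -->
      <news:publication_date>2010-02-06T11:17:00Z</news:publication_date>
      <news:title>Example post title</news:title>
    </news:news>
  </url>
</urlset>
```

The news: elements inside each url entry are the part that the default Sitemap schema does not have.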

I have coded and compiled it, and I'd like to feed it back into your source code repository. However, I don't have everything set up to do that, so I'd rather someone more used to the process took my code and added it.

I think a little adjustment would be required first, though, as I don't know enough about C# or the BlogEngine codebase.

Suggested changes to my code:

  1. Make sure all the posts are sorted by pubDate, descending, before outputting them. It looks like this is already the case, but I can't work out whether it is definitely done somewhere in the codebase.
  2. I suspect it could be more efficient. At the moment it literally goes through every post, but we only need it to return the most recent 2 days of posts, sorted as I've just suggested.
  3. I hard-coded some XML nodes for the News Sitemap for the particular blog I was doing it for. Clearly this sort of thing needs to be parameterised and moved into the Settings tabs in the Admin section somewhere, so someone would need to do that.
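
Suggestions 1 and 2 could be sketched along the following lines. This is a minimal sketch, not the code in NewsSiteMap.cs. PostInfo, NewsSitemapFilter and RecentPosts are hypothetical names, and PostInfo is only a stand-in for BlogEngine's own post class, whose members I'm not assuming here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a blog post; NOT the BlogEngine post class.
public class PostInfo
{
    public string Title { get; set; }
    public DateTime DateCreated { get; set; }
}

public static class NewsSitemapFilter
{
    // Returns the posts created within the last `days` days (relative to `now`),
    // sorted with the newest post first.
    public static List<PostInfo> RecentPosts(IEnumerable<PostInfo> posts, DateTime now, int days)
    {
        DateTime cutoff = now.AddDays(-days);
        return posts
            .Where(p => p.DateCreated >= cutoff && p.DateCreated <= now)
            .OrderByDescending(p => p.DateCreated)
            .ToList();
    }
}
```

The handler would then loop over the result of something like NewsSitemapFilter.RecentPosts(allPosts, DateTime.Now, 2) instead of over every post. Passing the current time in as a parameter is just a design choice that makes the filter easy to check with fixed dates.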

If anyone would like to do any or all of this, I'd be happy.

Code is available here: NewsSiteMap.cs

It should be dropped into ...\BlogEngine.Core\Web\HttpHandlers, added to the project and built into the binaries.

To activate it, I added the handler to both handler sections of the blog's Web.config (httpHandlers under system.web, and handlers under system.webServer):

		<httpHandlers>
			<!-- ... .. . -->
			<add verb="*" path="newssitemap.axd" type="BlogEngine.Core.Web.HttpHandlers.NewsSiteMap, BlogEngine.Core" validate="false"/>
			<!-- . .. ... -->
		</httpHandlers>


		<handlers accessPolicy="Read, Write, Script, Execute">
			<!-- ... .. . -->
			<add name="NewsSitemap" verb="*" path="newssitemap.axd" type="BlogEngine.Core.Web.HttpHandlers.NewsSiteMap, BlogEngine.Core" resourceType="Unspecified" requireAccess="Script" preCondition="integratedMode" />
			<!-- . .. ... -->
		</handlers>

http://www.theblog.com/newssitemap.axd was then the URL that I added to the blog's Google Webmasters account.

So far, so good with Google: articles are showing up within around 2-3 minutes of posting. (The blog's webmaster account had been notified that the blog was being added as a trusted source, though, so I suspect you shouldn't take that timing as what will necessarily happen for other sites.)

Cross-reference: CodePlex posting "Google Sitemap" - April 2009