Last week we discussed some of the on-site factors that affect search engine rankings within a CMS implementation. This week, we’ll revisit these factors and present guidance for addressing and automating these strategies during your CMS implementation.
Enforce W3C Compliant Code
Most mature CMS allow you to define and “lock down” the code used in templates. By validating these templates prior to deployment you can ensure most site content will be W3C compliant.
Some solutions, such as RedDot XCMS, incorporate a code validation utility. Automated third-party validation is also available, most notably from Watchfire.com, or pages can be validated manually using the W3C validation tools.
Create Generic Site Maps
Virtually all content management solutions allow automated generation and updating of site maps. Best practices suggest restricting the number of links on any one page to fewer than 100. This may require the creation of a series of hierarchical Site Maps to provide spiders with quick access to all site content.
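As a rough illustration of the hierarchical approach, the sketch below splits a flat list of URLs into site map pages of fewer than 100 links each, plus an index page that links to them. The function name, the page names and the two-level layout are assumptions made for this example and are not features of any particular CMS.

```python
from typing import Dict, List

MAX_LINKS_PER_PAGE = 99  # "fewer than 100" links on any one page


def paginate_site_map(urls: List[str],
                      max_links: int = MAX_LINKS_PER_PAGE) -> Dict[str, List[str]]:
    """Split a flat URL list into site map pages of at most max_links links.

    Returns a dict mapping a (hypothetical) page name to the links it lists.
    When more than one page is needed, "index.html" links to the child pages.
    """
    if len(urls) <= max_links:
        # Everything fits on a single site map page.
        return {"index.html": list(urls)}

    pages: Dict[str, List[str]] = {}
    for start in range(0, len(urls), max_links):
        name = f"sitemap-{start // max_links + 1}.html"
        pages[name] = urls[start:start + max_links]

    # The top-level page lists only the child site map pages.
    pages["index.html"] = list(pages)
    return pages
```

This only handles two levels. A very large site, with more child pages than the per-page limit, would need a further level of nesting.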
Deploy Google and Yahoo Site Maps
To the best of our knowledge, no mature content management solution is available that creates Google and Yahoo! site maps “out of the box.” A reasonably skilled developer should be able to extend any CMS with an open API to produce these maps in the required format.
Alternatively, the non-linear creations team has developed plug-ins to generate Google and Yahoo! sitemaps for a number of leading mid-tier content management solutions.
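To give a sense of what such an extension has to produce, here is a minimal sketch that writes an XML site map. It assumes the shared Sitemaps XML protocol (the sitemaps.org 0.9 schema) is the target format, and that the CMS API can supply each page's URL and last-modified date. The function name and the input shape are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

# Namespace of the Sitemaps XML protocol (assumed target format).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries: Iterable[Tuple[str, str]]) -> str:
    """Build a sitemap document from (url, last_modified_date) pairs.

    Dates are expected as YYYY-MM-DD strings. ElementTree escapes
    characters such as "&" in URLs automatically.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")
```

In practice the pairs would come from the CMS (for example, from a publishing hook), and the result would be written to a file at the site root.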
Mandate Search Engine Friendly URLs
There are two major models for content management solution architectures:
- Static Publication. In this model, activities associated with content management are separated from content delivery. CMS activities such as authoring, editing and workflow take place on one server (usually inside the firewall). The finished pages are transferred as static HTML to a separate web server, which does nothing but serve them to visitors. These systems almost always generate URLs with no dynamic variables.
- Dynamic Publication. Dynamic CMS systems assemble content “on the fly” to create a page as it is requested by a visitor. Content management and delivery activities take place within the same system. These systems usually generate long URLs littered with dynamic variables. Some of these systems provide work-arounds for this challenge by allowing URL aliases to be created. These aliases can be standard URLs.
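The alias idea can be sketched as a lookup table from clean paths to the dynamic URLs they stand for. The example below derives each alias from the page title. The function names and the sample dynamic URL are invented for illustration, and a real CMS will have its own alias mechanism.

```python
import re
from typing import Dict


def slugify(title: str) -> str:
    """Turn a page title into a lowercase, hyphen-separated URL segment."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def build_alias_table(pages: Dict[str, str]) -> Dict[str, str]:
    """Map a clean alias path to the dynamic URL it stands for.

    `pages` maps dynamic URL -> page title. Title collisions are not
    handled here; a later page would overwrite an earlier one.
    """
    return {f"/{slugify(title)}/": dynamic_url
            for dynamic_url, title in pages.items()}
```

A request for `/blue-widgets/` could then be resolved internally to the dynamic URL, so visitors and spiders only ever see the standard form.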
Publish to a Flat Directory
Many mature web content management solutions, particularly those that publish content statically, allow you to organize content hierarchically within the CMS independent of its physical location on the web server. With these systems you can maintain a much more complex system of organization within the CMS than is apparent in the directory structure of the live web site.
Eliminate Broken Links
Most content management solutions will not publish content that contains invalid links to other content maintained by the system. That is, they prohibit the publication of broken internal links on the site. With the exception of RedDot CMS, few are able to validate links to external sites. A number of utilities are available for monitoring the validity of links. These include:
- Watchfire WebXM (www.watchfire.com) – suitable for enterprise class sites
- LinkcheckerPro (www.linkcheckerpro.com)
- Xenu Link Sleuth (home.snafu.de/tilman/xenulink.html)
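For a sense of what these utilities do under the hood, the sketch below extracts the links from a page and reports those that fail a check. The `is_alive` callback is a placeholder assumed for this example. In practice it would issue an HTTP request and inspect the status code, but it is kept abstract here so the logic can be run without network access.

```python
from html.parser import HTMLParser
from typing import Callable, List
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_broken_links(html: str, base_url: str,
                      is_alive: Callable[[str], bool]) -> List[str]:
    """Return the absolute URLs linked from `html` that fail `is_alive`."""
    parser = LinkExtractor()
    parser.feed(html)
    # Resolve relative links against the page's own URL before checking.
    absolute = [urljoin(base_url, href) for href in parser.links]
    return [url for url in absolute if not is_alive(url)]
```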
Use Robots.txt Appropriately
Several CMS solutions allow end users to control the robots.txt on a page-by-page basis (most notably, HotBanana), but most leave its definition to the site developer.
Having a robots.txt file is not absolutely mandatory; its absence is most notable for generating 404 errors in server logs and messing up web log analysis tools. If you choose to implement the file, however, it is critical that it is valid and accurately defines access to site content. A mistake in implementation can prevent the major search engines from indexing any of your site content.
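One way to guard against such a mistake is to test a draft robots.txt against a list of pages you expect to be indexed before it goes live. The sketch below uses Python's standard `urllib.robotparser` module. The helper name `blocked_paths` and the sample paths are assumptions for this example.

```python
from typing import Iterable, List
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt: str, paths: Iterable[str],
                  agent: str = "*") -> List[str]:
    """Return the paths that `agent` may NOT fetch under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(agent, path)]
```

For example, a file containing only `User-agent: *` and `Disallow: /` blocks every path on the site, which is exactly the kind of error described above. With `Disallow: /admin/` instead, only paths under `/admin/` are reported as blocked.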
Address Canonical Issues
You want your visitors to find your site whether they type in http://www.url.com, http://url.com or http://www.url.com/index.html. But you certainly don’t want the search engines to see these as separate web sites with duplicate content.
Fortunately, you can take a few simple steps to overcome this challenge. Before you go live with your newly content-managed site, select one of these URLs as the site URL. Then set up permanent (301) redirects for the other URLs. For example, if you selected www.url.com as your primary URL, you would set up a 301 redirect for url.com and www.url.com/index.html.
The major search engines do not penalize permanent redirects. This simple method overcomes most canonical issues.
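The redirect rule itself is simple enough to sketch. The function below takes a requested URL and returns the address a 301 response should point to, or `None` when the URL is already canonical. It assumes www.url.com was chosen as the primary URL, as in the example above. The function name is hypothetical, and real deployments usually express this rule in web server configuration instead.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.url.com"  # the primary URL chosen before go-live


def canonical_redirect(url: str) -> Optional[str]:
    """Return the 301 target for `url`, or None if it is already canonical."""
    parts = urlsplit(url)
    current_path = parts.path or "/"

    # Collapse the home page's explicit file name to the bare root.
    target_path = "/" if current_path == "/index.html" else current_path

    current = urlunsplit((parts.scheme, parts.netloc.lower(), current_path,
                          parts.query, parts.fragment))
    target = urlunsplit((parts.scheme, CANONICAL_HOST, target_path,
                         parts.query, parts.fragment))
    return None if current == target else target
```

Both http://url.com and http://www.url.com/index.html resolve to http://www.url.com/ here, while http://www.url.com/ itself needs no redirect.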
Avoid Session Variables
Session variables in URL GET requests are most frequently employed as an alternative to cookies. They “hold state,” allowing the system to track one visitor throughout a visit. If your CMS is assigning session variables, you have two safe options and one potentially risky approach:
- Investigate replacing session variables with cookies as a means of holding state. Many systems provide this as a configurable option.
- Determine whether the use of session variables can be restricted to those parts of the site that absolutely require holding state. For example, an e-commerce element of your site may require the use of session variables, but you may not require their use throughout the site. If this is the case, you may be able to improve the likelihood of pages ranking in search engines by eliminating the variable.
- In the risky approach, you custom configure the system to identify inbound search spiders (by their IP address or User Agent) and consistently assign the same session variable. This overcomes the session variable challenge by ensuring the search engines recognize the persistence of a page over time. It is risky because any time you treat a search engine spider differently than human visitors you open yourself to accusations of “cloaking,” a decidedly black hat SEO tactic. The penalties applied by search engines when they detect cloaking are harsh and include the possibility of a permanent ban from listings.
It’s a risk that needs to be weighed against the potential gain. If your site is already ranking well, then don’t risk it. If your pages are not being indexed at all, then there is little downside to a ban. Your site is probably in between and you’ll need to make a judgment call.
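For the second of the safe options above, one possible building block is a helper that removes the session variable from links to pages that do not need state. This is only a sketch. The parameter names in `SESSION_PARAMS` are guesses at common conventions, not taken from any specific CMS, and would need to match whatever your system actually emits.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical session parameter names; adjust to match your CMS.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def strip_session_params(url: str) -> str:
    """Return `url` with any session parameters removed from its query string."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in SESSION_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))
```

Applied when links are generated for the non-commerce parts of the site, this leaves every visitor, human or spider, with the same URL for the same page.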
Reduce Code Clutter
Increasing the clarity and prominence of the text on a page is one of the simplest, most effective SEO tactics.
Most mature content management solutions provide the choice of using cascading style sheets (CSS) to control the format of a page. Making use of a CSS-based design, and then prohibiting the modification of this design, will eliminate much of the HTML code that would otherwise be required.
Create Site Navigation as Descriptive Text
When creating the design templates for your site, ensure that the navigation elements (those links that structure access to the site) are not images or Macromedia Flash objects, but simple text links. Enforce this navigation through the use of the content management solution’s template functions.
Many content management solutions allow you to automatically generate “bread crumbs” that indicate to visitors where they are in the site. Use this feature to generate a consistent keyword-rich text element on each page.
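If your CMS exposes a page's position in the site hierarchy, a breadcrumb trail of plain text links takes only a few lines to render. The sketch below assumes the trail is available as a list of (title, URL) pairs from the home page down to the current page. The function name and that input shape are hypothetical.

```python
from html import escape
from typing import List, Tuple


def breadcrumb_html(trail: List[Tuple[str, str]]) -> str:
    """Render (title, url) pairs as text links separated by ">".

    Ancestor pages become links; the current (last) page is plain text.
    """
    parts = [f'<a href="{escape(url, quote=True)}">{escape(title)}</a>'
             for title, url in trail[:-1]]
    if trail:
        parts.append(escape(trail[-1][0]))
    return " &gt; ".join(parts)
```

Because every element is ordinary text, the page titles in the trail are readable by spiders as well as by visitors.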
The Bottom Line
Embedding SEO best practices during the deployment of a content management solution can pay large dividends. If you are just about to proceed with your CMS implementation, this is definitely worth considering. And if you’re ready to expand your SEM strategy, review this Link Building Guide to explore link building tactics that help improve your search engine visibility.
About the Author
Randy Woods is a co-founder of non-linear creations. With his breadth of knowledge and experience in online strategy, content management and search marketing, Randy shares his lessons learned through the non-linear creations Leadership Series and a number of published whitepapers, including Best Practices in CMS Governance, SEO and CMS: Best Practices, and the NLC Performance Framework.