Google avoids showing websites with broken pages, server errors, detected malware, or duplicate pages in its search results, so it is important to monitor the health of your website's pages on a regular basis.
There are several good tools for doing this monitoring, such as Google Webmaster Tools, Screaming Frog’s Crawler and SEOMoz’s On-Page Optimization tool. I’d like to explore some of the benefits of using Google Webmaster Tools to monitor site performance.
Google Webmaster Tools (GWT)
In GWT, several sections are worth monitoring regularly for abrupt changes in the metrics. Under the Health section there are six areas: Crawl Errors, Crawl Stats, Blocked URLs, Fetch as Google, Index Status and Malware.
Crawl Errors – In this area, Google shows crawl errors that its bot has picked up. The important ones to watch are server errors and not-found errors. A server error usually appears as a 500 status code and means something is wrong on the server side; when this occurs, consult with an IT person to determine the problem. Not-found errors appear as 404 status codes, listing pages that no longer exist or that have broken links pointing to them. These can be quickly corrected with a 301 redirect to another page.
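As a minimal sketch of how such a redirect might look, assuming an Apache server with an .htaccess file and a hypothetical old page that now returns a 404:

```apache
# Hypothetical example: permanently redirect a removed page
# to its replacement so visitors and Googlebot land on a live URL.
Redirect 301 /old-page.html /new-page.html
```

The exact directive depends on your server; on nginx or IIS the equivalent rule is written differently, so check with whoever manages your hosting.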
Crawl Stats – There are three graphs to monitor in this area: pages crawled per day, kilobytes downloaded per day, and time spent downloading a page. Each of these is important to watch. If pages crawled per day or kilobytes downloaded per day starts to drop drastically, it is a signal that something is wrong with what Google can see or crawl. When a big change is detected, it is important to look for the cause or causes of the problem; common ones include duplicate titles or content, pages blocked in the robots.txt file, and spammy links or keywords on pages. Time spent downloading a page tells you how fast your pages load when someone opens a URL. The slower the page, the less likely the search engines are to rank it well in search results, so try to keep load times under 2 seconds.
Blocked URLs – This area shows the pages or folders in your robots.txt file that you don't want the search engines to access. Periodically check the pages listed here to make sure the file is not blocking pages that you do want crawled.
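You can also check a robots.txt rule yourself before waiting for Google's report. A minimal sketch using Python's standard library, assuming a hypothetical robots.txt that blocks a /private/ folder on example.com:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: block one folder, allow everything else.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Googlebot falls under the wildcard (*) rule here.
print(parser.can_fetch("Googlebot", "https://www.example.com/private/report.html"))  # False
print(parser.can_fetch("Googlebot", "https://www.example.com/products.html"))        # True
```

Running a quick check like this against your live robots.txt rules can catch an accidental block before it shows up in your crawl stats.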
Fetch as Google – This tool lets you see a page as Google sees it. This is particularly useful if you’re troubleshooting a page’s poor performance in search results.
Index Status – This tool shows the total number of pages Google's crawler has indexed, along with the number of pages that were not selected or are blocked by the robots.txt file. Pages that are not selected are usually ones that redirect to other pages or whose content is substantially similar (duplicate content) to other pages.
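One common way to tell Google which version of a duplicated page should be indexed is a canonical tag. As a sketch, assuming a hypothetical preferred URL, the tag goes in the &lt;head&gt; of each duplicate page:

```html
<!-- Hypothetical example: points Google at the preferred
     version of this page's content. -->
<link rel="canonical" href="https://www.example.com/products.html">
```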
Malware – Google will not show pages or websites in its search results where viruses or malware are present, so this area shows whether Google has detected any malware on your site. If any pages are listed here, act quickly to remove the malware so your website keeps showing up in search results.
There are several other areas in GWT that give important information on your traffic, internal and external links, duplicate titles and sitemaps. I will cover the benefits of these tools in a later post.