I recently had some horrible problems getting indexed by Google for a site the agency I work for redesigned, and I wanted to share my findings in hopes that it will help others.
Before I begin, you need to know that the site that had these issues was a legitimate site. It was for a non-profit organization that is 100% state funded and has been in existence for 9+ years with no previous Google issues. This site was as legitimate as they come.
The site was built from the ground up in Drupal 7 by the firm I work for, and the web team there has something like 40 years of web building experience between the 4 of us. Yet NONE of us had ever seen anything close to the trouble with Google that we were about to have.
The Problem with Google Index Begins
Before the launch of the redesign, the site ranked on or about #3 for the keywords they were targeting. One of the main reasons for this redesign was to bump that up to #1.
We launched the redesign of this site on or about May 30. We were not planning to launch until the first week in June, but the client was in a rush, so we accommodated and launched on a Friday afternoon (against our better judgement). Nevertheless, the site launched without a hitch, and everything continued to look fine throughout the next week. Then disaster struck.
The client called us around the middle of June complaining that their site was not showing up on the first page of results anymore. We gave our standard response of “sometimes Google indexing is weird and we promise it will go back up yadda yadda yadda.” That had been the case for every single website we had ever launched… except this one.
The site had not dropped in the listings, it was gone from Google’s index completely
At first, I could not believe it. I ran every SEO tool and analyzer I could get my hands on to no avail. The site was gone completely from Google’s index. Not a single page was being indexed by Google.
We kicked it into high alert mode then, double checking all of our work, reviewing our launch checklist, reuploading the sitemap.xml, checking Google Webmaster Tools. Everything we knew how to do.
Robots.txt disallow / is the most powerful piece of code on the web
My agency builds large websites. It is not uncommon for a site design and build to take 6-9 months. As a result, we use
User-agent: *
Disallow: /
in our robots.txt file on our development server in order to keep Google from indexing any part of the website in development.
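To make it concrete how sweeping that rule is, here is a minimal sketch using Python's standard `urllib.robotparser` module. The helper name `is_allowed` and the sample paths are made up for illustration, and the "production" version shown is just one common way to allow everything, not necessarily what we deployed. The standard-library parser follows the basic robots.txt conventions and may not match Google's own matching in every edge case.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule from the development server.
DEV_ROBOTS = """\
User-agent: *
Disallow: /
"""

# Assumed production version: an empty Disallow value blocks nothing.
PROD_ROBOTS = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, path: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical paths; any path on the site behaves the same way.
    for path in ("/", "/about", "/news/some-article"):
        print(path, "dev:", is_allowed(DEV_ROBOTS, path),
              "prod:", is_allowed(PROD_ROBOTS, path))
```

Because every URL path starts with "/", the single "Disallow: /" line matches all of them, which is why nothing on the site is left crawlable.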
In our haste to launch the site on a Friday at the client’s request, we neglected to change that one line before we put it onto the production server. It wasn’t on our launch checklist; it was just something we always did by rote. Except this one time.
And as it turns out “Disallow: /” really works. The site disappeared completely from all search results. It’s one stinking powerful piece of code. As stupid as it was to have missed it, we were kind of relieved to find our mistake. “Problem solved,” I thought. We submitted most of the major pages to be indexed through Google Webmaster Tools, and waited a week.
A week later, we checked Google Webmaster Tools. Google had successfully crawled the pages we submitted for indexing but did not index them.
I feel I need to reemphasize here that this was a high profile website. There were literally thousands of respectable backlinks to most pages on this site. It was not an issue of Google not thinking this site deserved to be indexed. There was something blocking it completely.
Resubmitting Robots.txt to Google Webmaster Tools
Now we were really puzzled and frustrated. We felt powerless. How do you force arguably the most powerful company in the western world to pay attention to your site when their own tools don’t even work?
We did what every frustrated web developer does when they are frustrated – we drank a Diet Mountain Dew and Googled the problem.
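The only answer we found that seemed to make any sense was that when you use “Disallow: /”, you are telling Google not to crawl ANY part of your site, including your robots.txt. By that reasoning, simply changing your robots.txt won’t work, because Google won’t crawl it anymore. In hindsight I am not sure that explanation is right: as far as I know, crawlers request robots.txt regardless of the rules inside it, and Google caches the file for a while, so a caching delay may have been part of what we were seeing.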
The Solution? Resubmit the robots.txt file for indexing with Google Webmaster Tools. You can do that by logging into your Google Webmaster Tools account, and then clicking on “Fetch as Google” in the left sidebar.
Still the Site was not Indexed by Google!
Again, we waited a week to give Google time to reindex the robots.txt file. When we checked back, Google had indeed fetched the robots.txt file, but they had not indexed the site! By this time it had been almost 2 months since the launch of the redesign and we were desperate! What could possibly be wrong? That’s when we found out about Manual Actions.
Manual Actions by Google are Scary
So scary, in fact, that I did not even know they existed. I had never heard of them before. But if you click on the “Search Traffic” link in Google Webmaster Tools, then click on “Manual Actions,” you can see if your site has one. Ours sure did.
What Google Manual Actions are
When Google detects what it considers to be a violation of its Webmaster Guidelines, it will add a manual action to that site, essentially blackballing it from being indexed. The key term here is “manual,” which seems to imply that a Google employee noticed the violation and manually flagged the site. But I do not believe this to be the case.
When we noticed that the manual action had been taken, we immediately filed a reconsideration request. Google says that it will take them a while to take any sort of action on a reconsideration request, but within 24 hours, our site was re-indexed, and this time we showed up as #1 for our target keywords.
Top Things to Look At If Your Site Has Problems Being Indexed by Google
Based on our experience, here are some things to look at if your site is not being indexed by Google.
- Check your robots.txt file
- Use Google Webmaster Tools
- Check to see if a manual action has been made against your site
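The first item on that list is the easiest one to automate, so here is a rough sketch of a launch-day check, again using Python's standard `urllib.robotparser`. The function name `blocked_paths` and the example paths are hypothetical, and the script assumes you have already copied the contents of the production robots.txt into a string. Treat it as a starting point for your own checklist rather than a complete SEO audit.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, user_agent="Googlebot"):
    """Return the subset of paths that robots_txt forbids user_agent to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(user_agent, path)]


if __name__ == "__main__":
    # Assumption: this text was copied from the live site's robots.txt.
    live_robots = "User-agent: *\nDisallow: /\n"
    # Hypothetical pages that must stay crawlable after launch.
    important = ["/", "/about", "/contact"]

    blocked = blocked_paths(live_robots, important)
    if blocked:
        print("Do NOT launch, these pages are blocked:", ", ".join(blocked))
    else:
        print("robots.txt allows all of the important pages.")
```

Running something like this against the production robots.txt before and after launch turns "something we always did by rote" into a step that fails loudly when it is skipped.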