How Google Indexes Your Web Pages


Google Hocus Pocus Part 2 –

One day your lens is in Google, next day it’s gone. Sound familiar? While we’ve talked about why your lens falls out of Google before, I think it’s time we talk about it on a deeper level. I’ve hesitated to do this mainly because it is a more advanced topic – but also because Google does what Google wants. There is no absolute understanding of Google – and I don’t want anyone to obsess over trying to do just that. But, let’s talk about ol’ Googlebot and your web pages.

Rule #1 – You Can’t Understand Google

Cold, hard fact is this… it’s Google’s search engine and they’ll index however the heck they want to. They do not WANT us to totally understand how it all works. Heck, I read that last year (2008), Google changed their algorithm over 450 times! That works out to more than one change every DAY on average.

There is no “code” to crack with Google. Whatever the “code” is today WILL BE different tomorrow. The sooner you accept that, the better off you’ll be.

Rule #2 – The Definition of Insanity Is TRYING To Understand Google

Seriously, the more and more you try and “figure out” Google, the crazier you will make yourself. All my men readers – C’mon guys… how many times have you looked at your wife or girlfriend and thought, “I am going crazy trying to figure you out!” ???

Honey, we women have NOTHING on Google. Google is the most temperamental “woman” you will EVER come across. Worse yet, Google has absolutely NO desire for you to figure out a THING about her. Ironically, the most temperamental woman in the world was developed by two men in a dorm room. (makes me laugh a little)

How Google Indexes Web Pages

With all those disclaimers and warnings out of the way, let’s talk about Google and web pages. Did you know it is Google’s objective to index every page on the web?

This is a HUGE job.

They also take it one step further by trying to rank these web pages in order of relevance to a search query. Just think about all the information that is processed in that 0.15 seconds it took to give you your search results! It’s amazing to me – astounding!

Now, Google also wants to find all the newest and freshest online content and get it in their index asap, too. Yet another HUGE undertaking. Just think how many people are publishing blog posts and articles and lenses and videos and ALLLLLL that every single second of every single day!

Google’s crawlers (GoogleBot) are out on the web all the time looking for new content, updating older content – just spidering the web like crazy.

In fact, GoogleBot actually has two types of crawlers – FreshBot and DeepBot (no, I am not making this up…lol)

Meet FreshBot and DeepBot

FreshBot has one job – find as much of the newest, freshest content as it can and get it in the Google index.

DeepBot has a monthly job – to deep crawl all the web pages on the internet, follow the links, evaluate the web pages, and then completely update the entire index and get it to all the Google data centers.

Let me use my blog as an example –

Good ol’ FreshBot is hanging around here or at least waiting on me all the time. I love FreshBot! Within minutes of me publishing a new post, here comes ol’ FreshBot. He grabs it and gets it in the Google index within about 30 minutes. Then, he will tell ol’ DeepBot to get over here to PotPieGirl.com and be sure to do a deep index of my site each month.

DeepBot also “remembers” to come do a deep crawl here by finding my blog from other links all over the internet (remember how important I said back links are???)

FreshBot finds you and gives you a bonus of getting in the index quickly. DeepBot decides what to do with you. In other words, DeepBot checks your back links and ALL that stuff that FreshBot couldn’t care less about.

What Does This GoogleBot Activity Look Like?

Now I want to show you an image from inside my Google WebMaster Tools. This is an image of the number of pages that have been crawled per day in the last 90 days here at PotPieGirl.com.

As you can see, ol’ FreshBot is in here daily checking for new content, new posts, new comments, etc. But, see that BIG spike there in the beginning of December? That is DeepBot coming in for a BIG look at my site. I’ve recently had two more big scans. I first realized this was happening here at PotPieGirl.com when I got a nice little “Warning!” message about my bandwidth usage from my hosting. Yes, DeepBot can eat up some bandwidth. The better your Page Rank, the deeper they go….and the more bandwidth they use.
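If you want to spot those big spikes in your own crawl stats without eyeballing a chart, here is a minimal Python sketch. It assumes you have typed your pages-crawled-per-day numbers into a list yourself. The function name `find_crawl_spikes`, the "3 times the median" cutoff, and the sample numbers are all made up for illustration; none of this comes from Google or from my real stats.

```python
from statistics import median

def find_crawl_spikes(pages_per_day, factor=3.0):
    """Return the positions of days whose crawl count is at least
    `factor` times the median day (hypothetical rule of thumb)."""
    baseline = median(pages_per_day)
    if baseline == 0:
        # No typical activity at all: treat any crawled day as notable.
        return [i for i, n in enumerate(pages_per_day) if n > 0]
    return [i for i, n in enumerate(pages_per_day) if n >= factor * baseline]

# Made-up daily counts: steady small visits with one big scan in the middle.
sample = [40, 55, 38, 60, 47, 900, 52, 44]
print(find_crawl_spikes(sample))  # prints [5]
```

The median is used instead of the average so that one giant scan does not drag the "normal day" number up with it.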

I can only imagine what the big sites with high Page Rank go through! You know, sites like Squidoo.com.

FreshBot, DeepBot, and Your Squidoo Lens

Ok – now Squidoo has a nice high Page Rank of 8. FreshBot hangs around there A LOT – probably all the time. He KNOWS there is constantly new content to crawl on that site.

Problem is, sometimes he misses stuff when crawling the Squidoo site. This is why we get some links to our new lenses to help FreshBot get there. Or we could do some edits and republish helping to remind FreshBot that we’re there. Even if we do all that, he still might miss. Hey, he’s a computer program…give him a break…lol

OR, FreshBot DOES find your new lens and you are in that Google index in a matter of hours – or minutes! Awesome! BUT – then days go by and DeepBot hasn’t been….and your new lens falls out of the index.

OR, FreshBot DOES get to you and DeepBot does, too – BUT, ol’ DeepBot didn’t find or wasn’t aware of any back links TO your content – and your lens falls out again. Remember now, just because YOU know there is a back link to your lens does NOT mean that DeepBot knows about it yet. And also keep in mind that you may SEE your back link in the Google index, but that could be the work of FreshBot – not DeepBot.

Regardless, it’s Google’s party and they’ll crawl if they want to =)
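The three scenarios above (FreshBot misses you, FreshBot finds you but DeepBot hasn’t been by, DeepBot comes but doesn’t know about your back links) can be written down as a tiny Python sketch. This is only a toy version of my own observations. The function `lens_status` and its three yes/no inputs are hypothetical, and this is NOT how Google actually decides anything.

```python
def lens_status(fresh_found, deep_crawled, backlinks_known):
    """Toy model of the scenarios described in this post.

    All three arguments are True/False guesses about your lens:
      fresh_found     - FreshBot has picked up the page
      deep_crawled    - DeepBot has done its deep crawl since then
      backlinks_known - DeepBot is aware of links pointing TO the page
    """
    if not fresh_found:
        return "not in the index yet"
    if not deep_crawled:
        return "in the index for now, may fall out"
    if not backlinks_known:
        return "may fall out after the deep crawl"
    return "good odds of sticking"

# A new lens that FreshBot grabbed, but DeepBot has not visited yet.
print(lens_status(fresh_found=True, deep_crawled=False, backlinks_known=False))
```

Notice that building a back link yourself does not flip `backlinks_known` to True. In this sketch, that only changes once DeepBot has seen the link.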

Bottom Line

When your lens is new and it appears to be struggling with Google, keep it fresh. I’m not talking about major updates – just a little sumthin-sumthin and republish. Also, keep making those links TO your lens (preferably from places that FreshBot hangs out).

All this with FreshBot and DeepBot applies to article marketing and blogs and ALL web sites. I know it’s confusing, but the best tip I can give to understand all this is to simply accept that you will never completely understand all this.

Good content that is well-optimized with links pointing TO that content WILL come back into the Google index. The higher the Page Rank of the site your content sits on, the better your odds of being found and STICKING in the Google index. The only thing that WON’T work to bring your lens back into the Google index is obsessing over why it’s not in the Google index. As I always say – KEEP MOVING FORWARD =)

Just for the record, I do not KNOW all this. Nor do I personally KNOW anyone at Google who has told me all this. These are my observations and the results of my reading and testing. You can read more about the GoogleBot and all her buddies here.
