Google Indexing Speed

RJM62

I put up a blog last week, and have been writing articles pretty much every day. I submitted it to search engines in the usual way, and didn't expect it to be indexed for some time. I mean, who really cares what I have to say?

I was very surprised to find the blog indexed within a day after I submitted it. That was kinda cool. But what absolutely stunned me happened about half an hour ago: I published an article, searched for its title just 19 minutes later, and found it already indexed. I was flabbergasted.

I closed that window, but just repeated the search so I could take a screenshot. 37 minutes.

What prompted me to search for the article was that I'd noticed that the Adsense ads were relevant to the content the first time I opened the article -- less than a minute after I posted it. Typically, when I change or add a page to a "regular" site, the default ads for the site come up for a while until the new or changed page is crawled.

I was surprised that they'd crawled the page for Adsense so quickly. But I was floored when I found that the article itself had been indexed so quickly.

Anyone know how Google does this? Do they just crawl blogs continuously, or does WordPress have some code built in that notifies search engines when a new article is published?

-Rich
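
On the WordPress half of that question: WordPress does have an "Update Services" setting that sends an XML-RPC `weblogUpdates.ping` to a ping service each time a post is published. Whether that is what led to the fast indexing described here is not established anywhere in this thread. As a rough sketch of what such a ping looks like, the snippet below only builds the request body and does not send anything. The blog name and URL are placeholders, not the real blog.

```python
import xmlrpc.client


def build_ping(blog_name: str, blog_url: str) -> str:
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    # weblogUpdates.ping takes two string parameters: the blog's name and its URL.
    return xmlrpc.client.dumps((blog_name, blog_url), methodname="weblogUpdates.ping")


# Placeholder values, assumed for illustration only.
payload = build_ping("Example Blog", "http://blog.example.com/")
print(payload)
```

A real ping would POST this body to whichever update service the blog is configured to use.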
 

Attachment: blog_ss.jpg

How do you submit to a search engine?

Mine get indexed really quickly too. I wrote my latest blog post about my meal, and within an hour I got the Google Alert telling me of the latest posting to my blog.

I don't have advertisements on mine, either.
 
POA gets indexed pretty fast. Because of my job I have a Google Alert on myself. I have posted here and gotten a Google alert in a minute or two.
 
How do you submit to a search engine?

Each search engine has a submission page. It's not usually necessary to fill it out; the crawlers usually find the sites on their own the first time they're linked from some other site (or in some cases, when they link to another site that keeps track of such things).

Google's submission page is http://www.google.com/addurl/ . But the blog also has been crawled by other search engines that I didn't submit it to.

I'm also told that simply searching on a brand new site's domain name triggers a crawl. I don't know if that's true, but it would make sense.

-Rich
 
Anyone know how Google does this?

Yup. (I worked on the indexing team for 3 years.) Sadly, I can't share details -- but one of our goals is to ensure that when content changes on the web it is reflected in our index as promptly as possible. Nice to hear that sometimes it works. :-)

Oh, and thank you. :-)

Chris
 
Is there a sensible way to remove the indexing from the bots? As in don't rifle through my site kind of thing?
 
Nice to hear that sometimes it works. :-)

So far it works pretty well. But I guess I'll have to look through the server logs to find the answer to my question. :rolleyes:

-Rich
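
For anyone wanting to do the same log check, here is a minimal sketch that pulls crawler visits out of an access log. It assumes the Apache "combined" log format, and the sample lines (IP addresses, timestamps, paths) are made up for illustration. The two user-agent substrings are the ones Google's search crawler and AdSense crawler identify themselves with.

```python
import re

# Hypothetical sample lines in Apache "combined" format; real logs will differ.
SAMPLE_LOG = """\
66.249.66.1 - - [01/Jan/2010:10:00:05 -0500] "GET /new-article/ HTTP/1.1" 200 5120 "-" "Mediapartners-Google"
66.249.66.1 - - [01/Jan/2010:10:03:12 -0500] "GET /new-article/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
192.0.2.10 - - [01/Jan/2010:10:04:00 -0500] "GET /new-article/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 6.1)"
"""

# Captures the timestamp, requested path, and user-agent of each request.
LINE_RE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Substrings to look for in the user-agent field.
CRAWLERS = ("Googlebot", "Mediapartners-Google")


def crawler_hits(log_text):
    """Return (timestamp, crawler, path) for each request made by a listed crawler."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        for name in CRAWLERS:
            if name in match.group("agent"):
                hits.append((match.group("time"), name, match.group("path")))
                break
    return hits


for hit in crawler_hits(SAMPLE_LOG):
    print(hit)
```

Comparing those timestamps with the time a post was published would show how soon each crawler arrived. A user-agent string can be spoofed, so a match is a hint rather than proof.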
 
Is there a sensible way to remove the indexing from the bots? As in don't rifle through my site kind of thing?

Frank:

Yes, with a robots.txt file: http://www.robotstxt.org/

There are also ways to keep the index-bots from counting a link toward ranking, using the nofollow attribute: http://en.wikipedia.org/wiki/Nofollow

I've found that most spiders ignore robots.txt. The good guys like Google will stop there, but there are a lot of other crawl bots out there.

Joe
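
To make the robots.txt idea concrete, here is a small sketch using Python's standard `urllib.robotparser`, which is roughly the check a well-behaved crawler performs before fetching a page. The two robots.txt bodies and the example.com URLs are illustrative assumptions, not taken from anyone's site.

```python
from urllib.robotparser import RobotFileParser

# Asks every crawler to stay out of the whole site.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""

# Asks every crawler to skip only the /private/ directory.
BLOCK_PRIVATE = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if robots_txt permits the given agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(allowed(BLOCK_EVERYTHING, "Googlebot", "http://example.com/blog/post.html"))
print(allowed(BLOCK_PRIVATE, "Googlebot", "http://example.com/private/notes.html"))
print(allowed(BLOCK_PRIVATE, "Googlebot", "http://example.com/blog/post.html"))
```

The file is only a request. Nothing in it stops a crawler that chooses not to run a check like this, which is the caveat raised above.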
 
Okay, this is amazing. I posted a new article, got up to feed the turtle, and the article was indexed by the time I got back to my desk.

Three minutes. Almost scary.

-Rich
 

Attachment: blog_ss2.jpg