
This morning, I spotted a group of SEOs arguing about whether a URL can be "indexed" if the page is blocked by a noindex tag or robots.txt file.

It is a valid SEO question, but when you ask it in a room of SEO geeks, the responses can get pretty wild.

You've all seen examples of URLs in the search results that list just the URL, without the page's actual title tag and snippet. That is typically because Google has the URL in its database as a reference but has not crawled or indexed the content of the web page, because it is restricted from doing so for one reason or another.

The question is, is that URL considered indexed or not? That depends on the definition of "indexed" and which SEO you ask.

Let me share the tweets about this:

Yep -> RT @rustybrick: @AndyBeard but a page URL can be indexed without being crawled @seosmarty @rishil
@rustybrick @seosmarty @rishil In google terminology that is just a reference
@AndyBeard but a page URL can be indexed without being crawled @seosmarty @rishil
@rishil indexation=appears in search results (index). crawling=going through the page itself - right? cc @AndyBeard
@seosmarty @rishil A page has to be crawled first to be indexed... a reference link != indexed
The discussion went on for dozens and dozens of tweets with no one winning.

Matt or John, want to chime in and give the final answer?

Forum discussion at, um, Twitter.
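For anyone who wants to see the distinction the tweets are circling, here is a minimal Python sketch. It assumes a made-up robots.txt for example.com and a hypothetical helper, `describe_listing`, neither of which comes from the discussion. It only models the idea that a URL can be known (for example, through links) while crawling is disallowed; it is not a description of how Google actually decides anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for example.com (an assumption for illustration).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def describe_listing(url, known_via_links, robots, user_agent="Googlebot"):
    """Toy model: label how a URL might show up in search results.

    known_via_links: whether the engine has discovered the URL at all,
    for example because another page links to it.
    """
    if not known_via_links:
        return "unknown"
    if not robots.can_fetch(user_agent, url):
        # The URL is known, but the page content may not be fetched.
        return "url-only reference"
    return "crawlable"


robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())  # parse in memory, no network request

blocked = describe_listing("https://example.com/private/page.html", True, robots)
allowed = describe_listing("https://example.com/public/page.html", True, robots)
print(blocked)
print(allowed)
```

Whether you call the "url-only reference" case "indexed" is exactly the definitional question being argued over.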
If a reference link has not been crawled, it has not been indexed, but it can still appear in the results. Chances are, though, you will never see one. Why? Google has zero content for it. A lot of the argument here is simply over syntax.
No brainer. If it's indexed without displaying a proper snippet, it's indexed but not crawled. If it displays meta info or Google-generated info in the snippet, it's been crawled and indexed.
I pulled out of these sorts of discussions a few years ago because of their never-ending nature. It's basically a discussion about semantics!
For me, indexed = first crawl, and crawling is the continuous checking by bots for updates. The cache is involved too (the snapshot of when the last crawl took place) :D