Twitter's robots.txt question:

Hemat

Junior Member

Join Date: Dec 2010

Posts: 1
#1


25 December 2010, 07:00 AM

Twitter's robots.txt appears to disallow everything, yet search engines are crawling and indexing everybody's profile pages. Why?
taylortbb

Junior Member

Join Date: Jan 2011

Posts: 1
#2

08 January 2011, 10:12 PM

Twitter doesn't disallow all URLs. If they wanted to do that, they'd use "Disallow: /". Their "Disallow: /*?" rule disallows every URL containing a ? for robots that recognize wildcards (Google and Yahoo only, as far as I know). Other robots treat the * as a literal character, like any other. Twitter profile pages, tweet pages, etc. don't use a ? in the URL (no HTTP GET parameters), so search engines index them.
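To make the difference concrete, here is a rough Python sketch of the two interpretations described above. It is only an illustration, not how any particular crawler is actually implemented: the function names and the sample paths (/some_user, /search?q=robots) are made up for the example, and real robots.txt parsers handle more cases (Allow rules, user-agent groups, and so on).

```python
import re


def wildcard_match(rule: str, path: str) -> bool:
    """Interpretation for robots that recognize wildcards.

    The rule is a prefix pattern in which '*' matches any run of
    characters; every other character (including '?') is literal.
    """
    regex = ".*".join(re.escape(part) for part in rule.split("*"))
    # re.match anchors at the start only, which gives prefix semantics.
    return re.match(regex, path) is not None


def literal_match(rule: str, path: str) -> bool:
    """Interpretation for robots that do not recognize wildcards.

    The rule is a plain prefix, so '*' is just another character.
    """
    return path.startswith(rule)


if __name__ == "__main__":
    # Hypothetical paths, chosen only for illustration.
    for path in ["/some_user", "/search?q=robots"]:
        print(path,
              "| wildcard robot blocked:", wildcard_match("/*?", path),
              "| literal robot blocked:", literal_match("/*?", path),
              "| 'Disallow: /' blocked:", literal_match("/", path))
```

Under these assumptions, a path with no ? (like a profile page) is not blocked by "/*?" under either reading, while "Disallow: /" would block every path.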