Twitter's robots.txt question:

  • Twitter's robots.txt question:

    Twitter's robots.txt appears to disallow everything, yet search engines are crawling and indexing everybody's profile pages. Why?

  • #2
    Twitter doesn't disallow all URLs. If they wanted to do that, they'd use "Disallow: /". Their "Disallow: /*?" rule disallows every URL containing a ? for robots that recognize wildcards (Google and Yahoo only, as far as I know). Other robots interpret the * like any other character. Twitter profile pages, tweet pages, etc. don't use a ? in the URL (no HTTP GET parameters), so search engines index them.
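
If it helps, here is a rough Python sketch of the two interpretations described above. It is only an illustration, not how any particular crawler is actually implemented: the function names and the example paths ("/someuser", "/search?q=robots") are made up, and it assumes a wildcard-aware robot treats * as "any run of characters" while other robots do a plain prefix comparison.

```python
import re


def wildcard_match(pattern: str, path: str) -> bool:
    """Assumed behaviour of a wildcard-aware robot: '*' matches any run of characters."""
    # Escape each literal piece, then join the pieces with '.*' in place of each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    # re.match anchors at the start only, so the rule acts as a prefix pattern.
    return re.match(regex, path) is not None


def literal_match(pattern: str, path: str) -> bool:
    """Assumed behaviour of a robot without wildcard support: '*' is an ordinary character."""
    return path.startswith(pattern)


# Hypothetical paths, chosen only to show the difference.
for path in ["/someuser", "/someuser/status/123", "/search?q=robots"]:
    print(
        path,
        "| blocked with wildcards:", wildcard_match("/*?", path),
        "| blocked without wildcards:", literal_match("/*?", path),
    )
```

Under these assumptions only the path containing a ? is blocked, and only for robots that understand wildcards, whereas "Disallow: /" would block every path for every robot.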
