feedster and robots.txt

feedster now partially supports the robots.txt standard.

scott: regarding caching of robots.txt, i’d prefer:

1st priority: “Expires” header
2nd priority: UTC rule as above

sounds reasonable (and not too difficult to implement) for
Feedster. If Feedster indexes a domain.com in a session-like
manner, fetching /robots.txt once per session as 2nd priority
would probably be reasonable as well.
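
a minimal python sketch of that priority order, under stated assumptions: the function name `robots_cache_expiry` is made up for illustration, and the 2nd-priority rule is passed in as an already computed `fallback` time, since the exact rule is not spelled out here. this is not feedster's actual code.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping


def robots_cache_expiry(headers: Mapping[str, str], fallback: datetime) -> datetime:
    """Return the time until which a cached robots.txt may be reused.

    1st priority: the "Expires" response header, if present and parseable.
    2nd priority: the caller-supplied fallback (hypothetical parameter that
    stands in for whatever fallback rule the crawler uses).
    """
    raw = headers.get("Expires")
    if not raw:
        return fallback
    try:
        expires = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        # Unparseable values such as "0" are ignored here; this is an
        # assumption of the sketch, a crawler could also treat them as
        # "already expired".
        return fallback
    if expires.tzinfo is None:
        # Treat a date without zone information as UTC.
        expires = expires.replace(tzinfo=timezone.utc)
    return expires
```

the header lookup is case-sensitive in this sketch, so the caller would need to normalise header names first.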

(excerpt of my reply of july 19)

reasons: it’s more blogger-friendly (it hands control of caching over to them), and it makes more sense in modern (short-lived) times. regarding images: remember that robots.txt addresses any kind of file (it specifies retrieval based on location, not content). if you plan to offer some fine-grained copyright handling in addition, robots.txt should still always be respected (it’s the only indexing standard we currently have).
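
to illustrate the location-not-content point, here is a small sketch using python's standard `urllib.robotparser`. the rules, the `ExampleBot` user agent and the example.com urls are all invented for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: one rule, keyed on a path prefix.
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /private/",
])

agent = "ExampleBot"  # hypothetical crawler name

# The file type plays no role; only the location is matched.
private_image = rules.can_fetch(agent, "http://example.com/private/photo.jpg")
private_page = rules.can_fetch(agent, "http://example.com/private/post.html")
public_image = rules.can_fetch(agent, "http://example.com/images/photo.jpg")
```

both urls under `/private/` are refused, image or not, while the image outside that path is allowed.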
