scott: regarding caching of robots.txt, i’d prefer:
1st priority: “Expires” header
2nd priority: UTC rule as above
sounds reasonable (and not too difficult to implement) for
Feedster. If Feedster indexes a domain.com in a session-like
manner, fetching /robots.txt once per session as 2nd priority
would probably be reasonable as well.
(excerpt of my reply of july 19)
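a minimal python sketch of that priority order, in case it helps. the function name, the `headers` dict and the `fallback_ttl` parameter are all made up for illustration; the fallback simply stands in for whatever 2nd-priority rule is chosen (the UTC rule or once per session), which this sketch does not try to model.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def robots_cache_expiry(headers, fetched_at, fallback_ttl):
    """Return the UTC time until which a fetched robots.txt may be reused.

    headers      -- response headers as a plain dict (assumed to use the key "Expires")
    fetched_at   -- timezone-aware datetime of the fetch
    fallback_ttl -- timedelta standing in for the 2nd-priority rule (hypothetical)
    """
    expires = headers.get("Expires")
    if expires:
        try:
            parsed = parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            parsed = None  # malformed date: fall through to the 2nd priority
        if parsed is not None:
            if parsed.tzinfo is None:
                # a date without a usable offset is treated as UTC here
                parsed = parsed.replace(tzinfo=timezone.utc)
            # 1st priority: the site owner's own "Expires" value
            return parsed.astimezone(timezone.utc)
    # 2nd priority: the crawler's own rule
    return fetched_at + fallback_ttl
```

a crawler would then refetch /robots.txt only once the current time has passed the returned value. an "Expires" date already in the past just means the file gets refetched on the next visit.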
reasons: it’s more blogger-friendly (it hands control of caching over to them) and it makes more sense in modern (short-lived) times. regarding images: remember that robots.txt applies to any kind of file (it governs retrieval based on location, not content). even if you plan to offer some fine-grained copyright handling in addition, robots.txt should still always be respected (it’s the only indexing standard we currently have).
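to illustrate the location-not-content point, here is a small sketch using python's standard `urllib.robotparser`. the rules, the `/photos/` path and the URLs are invented for the example; only the "Feedster" name comes from this thread.

```python
from urllib.robotparser import RobotFileParser

# hypothetical robots.txt content for the example
RULES = """User-agent: *
Disallow: /photos/
"""


def allowed(robots_txt, user_agent, url):
    """Check a URL against robots.txt text; only the URL's path is considered."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# an image and an html page under the same disallowed path are treated alike
image_ok = allowed(RULES, "Feedster", "http://domain.com/photos/cat.jpg")
page_ok = allowed(RULES, "Feedster", "http://domain.com/photos/index.html")
# an image outside that path is not affected by the rule
other_ok = allowed(RULES, "Feedster", "http://domain.com/blog/header.jpg")
```

the file type never enters the decision, so an image crawler has to run the same check as a text crawler.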