This repository has been archived by the owner on Nov 14, 2019. It is now read-only.

unexpected behavior of robots_txt option #134

Open
viktor-svirsky opened this issue Nov 20, 2017 · 1 comment

Comments

@viktor-svirsky

Hi @johtani, I ran into unexpected behavior when trying to crawl a site with the robots_txt option enabled.

The site's robots.txt looks like this:
User-agent: *
Disallow: /

I expected that the site would not be crawled.

I also tried changing the user agent to:
User-agent: River Web
User-agent: RiverWeb

but the crawl still returns results.

Please advise.
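
For reference, the expectation above matches how a general-purpose robots.txt parser reads those two rules. The following is a minimal sketch using Python's standard `urllib.robotparser`, not River Web's own implementation; the helper name `is_allowed`, the agent string "RiverWeb", and the example.com URLs are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in the issue: every user agent is denied every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "RiverWeb" and the example.com URLs are illustrative assumptions.
print(is_allowed(ROBOTS_TXT, "RiverWeb", "http://example.com/"))
print(is_allowed(ROBOTS_TXT, "RiverWeb", "http://example.com/page.html"))
```

Both calls print `False`, so a crawler that honors robots.txt in this way would skip the whole site. Whether River Web does the same is a separate question that this sketch does not answer.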

@marevol
Contributor

marevol commented Nov 26, 2017

The behavior depends on the crawling configuration.
