Implementing fetcher.server.min.delay customization per domain/host/ip #889
jcruzmartini started this conversation in Ideas
Replies: 1 comment 1 reply
-
That could be a valuable contrib, thanks @jcruzmartini
-
Hi @jnioche, we have been dealing with some issues with a specific domain that has tons of URLs. Given that fetcher.server.min.delay is the minimum time we want to wait before hitting this domain again, I am wondering whether it would be a good idea to make this parameter configurable per domain/host/ip, the same way you already do for the maximum number of fetch threads with fetcher.maxThreads.host/domain/ip.
I am thinking of something like:
fetcher.server.min.delay.host/domain/ip
fetcher.server.delay.host/domain/ip
This would allow us to configure the crawler to hit a specific domain more frequently than the rest of the domains/hosts/ips.
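To make the idea concrete, here is a minimal sketch of how such a lookup could work. It is only an illustration and not existing crawler code: the class PerQueueDelay, the method resolveMinDelay, and the assumption that the overrides are stored as a map from host/domain/ip to delay under fetcher.server.min.delay.<mode> are all hypothetical. Any queue without an override falls back to the global fetcher.server.min.delay.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: resolves a per-queue minimum delay with a global fallback. */
class PerQueueDelay {
    static final String MIN_DELAY_KEY = "fetcher.server.min.delay";

    /**
     * @param conf      configuration as a key/value map
     * @param queueMode "host", "domain" or "ip" (assumed suffix of the override key)
     * @param queueKey  the concrete host name, domain or IP of the fetch queue
     * @return the delay to apply, in the same unit as the global setting
     */
    static double resolveMinDelay(Map<String, Object> conf, String queueMode, String queueKey) {
        // Global value, used whenever no override matches (0.0 if unset in this sketch).
        double globalDelay = toDouble(conf.get(MIN_DELAY_KEY), 0.0);

        // Assumed layout: fetcher.server.min.delay.<mode> holds a map of key -> delay.
        Object overrides = conf.get(MIN_DELAY_KEY + "." + queueMode);
        if (overrides instanceof Map) {
            Object custom = ((Map<?, ?>) overrides).get(queueKey);
            if (custom != null) {
                return toDouble(custom, globalDelay);
            }
        }
        return globalDelay;
    }

    // Accepts any numeric config value; non-numeric values fall back to the default.
    private static double toDouble(Object value, double fallback) {
        return (value instanceof Number) ? ((Number) value).doubleValue() : fallback;
    }

    public static void main(String[] args) {
        Map<String, Object> hostOverrides = new HashMap<>();
        hostOverrides.put("fast.example.com", 0.1);

        Map<String, Object> conf = new HashMap<>();
        conf.put("fetcher.server.min.delay", 1.0);
        conf.put("fetcher.server.min.delay.host", hostOverrides);

        // The overridden host gets its own delay, every other host the global one.
        System.out.println(resolveMinDelay(conf, "host", "fast.example.com"));
        System.out.println(resolveMinDelay(conf, "host", "other.example.com"));
    }
}
```

The same lookup could be reused for fetcher.server.delay.host/domain/ip by passing a different base key.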
I can create a pull request with this proposal if you think this feature would be useful for the project.
Thanks
Juan