Implement Scrapy style concurrency controls? #39

philbudne · 2024-02-21T00:28:33Z

Currently rss-fetcher limits the number of concurrent requests and enforces a minimum interval between connections to each RSS server.

My reading of Scrapy (and what I implemented in the queue-based story fetcher, under control of SCRAPY_LATENCY) is that it keeps a moving average of page fetch time for each destination server, and uses AVG_FETCH_TIME / CONCURRENT_CONNECTION_GOAL as the interval between connections to that server.
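A minimal sketch of that calculation, for discussion only. It is not taken from rss-fetcher, the story fetcher, or Scrapy: the class name `LatencyThrottle` and the parameters `concurrency_goal`, `alpha` and `initial_interval` are made up for illustration, and the exponential moving average is just one possible choice of "moving average".

```python
class LatencyThrottle:
    """Per-host connection interval derived from a moving average of fetch time.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """

    def __init__(self, concurrency_goal=2.0, alpha=0.25, initial_interval=1.0):
        self.concurrency_goal = concurrency_goal  # CONCURRENT_CONNECTION_GOAL
        self.alpha = alpha                        # weight given to the newest sample
        self.initial_interval = initial_interval  # used until a host has a sample
        self.avg_fetch_time = {}                  # host -> AVG_FETCH_TIME in seconds

    def record(self, host, fetch_seconds):
        """Fold one observed fetch time into the host's moving average."""
        prev = self.avg_fetch_time.get(host)
        if prev is None:
            # First sample for this host seeds the average.
            self.avg_fetch_time[host] = fetch_seconds
        else:
            self.avg_fetch_time[host] = (
                self.alpha * fetch_seconds + (1 - self.alpha) * prev
            )

    def interval(self, host):
        """Seconds to wait between starting connections to this host."""
        avg = self.avg_fetch_time.get(host)
        if avg is None:
            return self.initial_interval
        return avg / self.concurrency_goal
```

With a goal of 2 concurrent connections, a host averaging 4 seconds per fetch would get a new connection every 2 seconds, so slow servers are contacted less often and fast ones more often.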

The only time this might matter is when the server has been off-line (down, or off the Internet) and there is a large backlog of feeds overdue for fetching.
