In the code below,
`len(self.crawler.engine.slot.scheduler)` always returns 0, and
`self.crawler.engine.slot.scheduler.stats._stats['scheduler/enqueued']` returns values in increasing order: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
I was expecting the queue to be large initially and to shrink as URLs get crawled: a high queue depth before crawling and a lower value afterwards.
Also, uncommenting this code shows the same trend of increasing queue size:

```python
if next_page is not None:
    next_page = response.urljoin(next_page)
    yield scrapy.Request(next_page, callback=self.parse)
```

Note: I have set `CONCURRENT_REQUESTS = 1` in settings.
```python
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes_spider"
    start_urls = [
        "https://quotes.toscrape.com/page/1/",
        "https://quotes.toscrape.com/page/2/",
        "https://quotes.toscrape.com/page/3/",
        "https://quotes.toscrape.com/page/4/",
        "https://quotes.toscrape.com/page/5/",
        "https://quotes.toscrape.com/page/6/",
        "https://quotes.toscrape.com/page/7/",
        "https://quotes.toscrape.com/page/8/",
        "https://quotes.toscrape.com/page/9/",
        "https://quotes.toscrape.com/page/10/",
    ]

    def parse(self, response):
        print(f"\n before {self.crawler.engine.slot.scheduler.stats._stats['scheduler/enqueued']} \n\n")
        print(f"\n before2 {len(self.crawler.engine.slot.scheduler)}")  # don't know why it always returns zero
        for quote in response.css("div.quote"):
            yield {
                "text": quote.css("span.text::text").get(),
                "author": quote.css("small.author::text").get(),
                "tags": quote.css("div.tags a.tag::text").getall(),
            }
        next_page = response.css("li.next a::attr(href)").get()
        if next_page is not None:
            next_page = response.urljoin(next_page)
            yield scrapy.Request(next_page, callback=self.parse)
        print(f"\n After {self.crawler.engine.slot.scheduler.stats._stats['scheduler/enqueued']} \n\n")
        print(f"\n after2 {len(self.crawler.engine.slot.scheduler)}")  # don't know why it always returns zero
```

This is the original question (I could not comment there because of low reputation): How to get the number of requests in queue in scrapy?
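If I understand correctly, `scheduler/enqueued` is a cumulative counter (total requests ever pushed onto the scheduler), not the current queue depth, which would explain why it only grows. A minimal sketch of what I mean by "current queue depth", assuming the stats dict also carries a cumulative `scheduler/dequeued` counter (`pending_requests` is my own helper here, not a Scrapy API):

```python
def pending_requests(stats: dict) -> int:
    """Current queue depth = total ever enqueued minus total ever dequeued."""
    return stats.get("scheduler/enqueued", 0) - stats.get("scheduler/dequeued", 0)


# Example with counter values a crawl snapshot might report:
snapshot = {"scheduler/enqueued": 10, "scheduler/dequeued": 9}
print(pending_requests(snapshot))  # 1 request still waiting in the queue
```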
The spider code is copied from the Scrapy tutorial: https://docs.scrapy.org/en/latest/intro/tutorial.html