run:
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

settings = get_project_settings()
process = CrawlerProcess(settings)
for section in config:
    process.crawl(Spider, section)  # schedule one crawl per config section
process.start()                     # blocks until every scheduled crawler finishes
log:
[scrapy.extensions.logstats] INFO: Crawled 15964 pages (at 0 pages/min), scraped 12820 items (at 0 items/min)
[scrapy.extensions.logstats] INFO: Crawled 15964 pages (at 0 pages/min), scraped 12820 items (at 0 items/min)
[scrapy.extensions.logstats] INFO: Crawled 15964 pages (at 0 pages/min), scraped 12820 items (at 0 items/min)
[scrapy.extensions.logstats] INFO: Crawled 15964 pages (at 0 pages/min), scraped 12820 items (at 0 items/min)
[scrapy.extensions.logstats] INFO: Crawled 15964 pages (at 0 pages/min), scraped 12820 items (at 0 items/min)
......
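For comparison, Scrapy's documented pattern for running several spiders one after another in a single process uses CrawlerRunner with the Twisted reactor. This is only a minimal sketch reusing the Spider class and config list from the snippet above, not a fix for the hang:

from twisted.internet import defer, reactor
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from scrapy.utils.project import get_project_settings

configure_logging()
runner = CrawlerRunner(get_project_settings())

@defer.inlineCallbacks
def crawl():
    # run each spider to completion before starting the next one
    for section in config:
        yield runner.crawl(Spider, section)
    reactor.stop()

crawl()
reactor.run()  # blocks until reactor.stop() is called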
1
panyanyany 2017-05-13 10:23:33 +08:00
Which Python version are you on? If it's py3, you can try this tool: https://opensourcehacker.com/2015/04/16/inspecting-thread-dumps-of-hung-python-processes-and-test-runs/
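A standard-library alternative to the tool linked above is faulthandler; this generic diagnostic sketch assumes a Unix host and is not specific to Scrapy:

import faulthandler
import signal

# dump the traceback of every thread to stderr when SIGUSR1 arrives
faulthandler.register(signal.SIGUSR1, all_threads=True)

With that registered in the script, kill -USR1 <pid> prints where each thread is currently blocked, without killing the process.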
2
dsg001 OP @panyanyany py3.5. Is it a version issue? I want to understand what is actually causing this: the spiders run fine when executed individually, and force-killing the process means some spider_closed handlers never get to run.
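For context on the spider_closed point: the handler only runs when Scrapy shuts a spider down cleanly, so a SIGKILL skips it entirely. A minimal sketch of the usual signal hookup (MySpider and its cleanup are placeholders, not the poster's code):

import scrapy
from scrapy import signals

class MySpider(scrapy.Spider):
    name = "example"

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        spider = super().from_crawler(crawler, *args, **kwargs)
        crawler.signals.connect(spider.spider_closed, signal=signals.spider_closed)
        return spider

    def spider_closed(self, spider):
        # cleanup here runs only on a clean shutdown;
        # kill -9 terminates the process before Scrapy can fire this signal
        self.logger.info("closed: %s", spider.name)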