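I wrote this with Scrapy and hit a problem: when the spider runs, the items never go through the pipelines file.
wincos is the project's main directory.
wincos/spiders/win4.py contains: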
import scrapy
from wincos.items import WincosItem
from scrapy.http import Request


class Win4Spider(scrapy.Spider):
    name = 'win4'
    allowed_domains = ['www.win4000.com']
    start_urls = ['http://www.win4000.com/meinvtag26_1.html']

    def parse(self, response):
        mtitem = WincosItem()
        # "title" field (it actually holds the image src URLs)
        mtitem['title'] = response.xpath("//a/img/@src").extract()
        # http://www.win4000.com/meinv
        print("================")
        print(mtitem['title'])
        yield mtitem
        for i in range(1, 3):
            url = "http://www.win4000.com/meinvtag26_" + str(i) + ".html"
            print(url)
            yield Request(url, callback=self.parse)
The items file contains:
import scrapy


class WincosItem(scrapy.Item):
    title = scrapy.Field()
The pipelines file contains:
class WincosPipeline(object):
    def process_item(self, item, spider):
        print("===========88888888============")
        print(item)
        for i in range(0, len(item['title'])):
            print("===========666666============")
            print(item['title'][i])
        return item
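Running it prints the item as {'title': [all of the images]},
but execution never enters the pipeline, and I can't tell where the problem is. I want the data to arrive there so I can save it.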
1
wuyifar 2020-03-05 11:10:00 +08:00
Have you set ITEM_PIPELINES in settings.py? Try raising the priority a bit and see what happens.
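A minimal sketch of what that setting could look like. The dotted path `wincos.pipelines.WincosPipeline` is an assumption based on the project layout described in the question (a `pipelines.py` file directly under `wincos`); adjust it if the file lives elsewhere. Scrapy only calls `process_item` on pipelines listed here.

```python
# wincos/settings.py (assumed location, inferred from the question)
# Keys are dotted paths to pipeline classes; values set the run order.
ITEM_PIPELINES = {
    "wincos.pipelines.WincosPipeline": 300,
}
```

Pipelines with lower numbers run first, and the values are conventionally kept in the 0 to 1000 range, so "raising the priority" means using a smaller number.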
4
Dustyposa 2020-03-05 17:04:19 +08:00
Use `Path(name).write_bytes()`
to save the images.
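A rough sketch of that suggestion, assuming the raw image bytes have already been downloaded. The helper name `save_image`, the `images` output folder, and the `unnamed.jpg` fallback name are all hypothetical and not part of the original thread; only `Path.write_bytes()` comes from the reply.

```python
from pathlib import Path
from urllib.parse import urlparse


def save_image(url: str, data: bytes, out_dir: str = "images") -> Path:
    """Write raw image bytes into out_dir, named after the last URL path segment."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    # Fall back to a placeholder name when the URL path has no file name.
    name = Path(urlparse(url).path).name or "unnamed.jpg"
    target = folder / name
    target.write_bytes(data)  # the call suggested in the reply
    return target
```

In a Scrapy callback that receives an image response, this could be called as `save_image(response.url, response.body)`, since `response.body` holds the raw bytes.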