class PicSpiderPipeline(ImagesPipeline): def get_media_requests(self, item, info): image_requests = super().get_media_requests(item,info) for img_req in image_requests: img_req.item = item return image_requests def file_path(self, request, response=None, info=None, *, item=None): old_path = super().file_path(request,response,info) title = request.item['title'] save_path = os.path.join(settings.IMAGES_STORE,title[0].replace('|','')) image_name = old_path.replace('full/','') return os.path.join(save_path,image_name) In this code, `title` contains the special character `|`. Without `.replace('|','')` the code raises no error, but the images are simply not downloaded. In practice, how can I track down this kind of problem and print the reason the images were not downloaded?
Time: 2024-04-08 18:35:31
To find out why the images are not downloaded, add some debugging output to the `PicSpiderPipeline` class: for example, print the value of `title` in `get_media_requests` and the value of `old_path` in `file_path`.
```python
import os

from scrapy.pipelines.images import ImagesPipeline

# `settings` is assumed to be the project's settings module, as in the question.


class PicSpiderPipeline(ImagesPipeline):
    def get_media_requests(self, item, info):
        image_requests = super().get_media_requests(item, info)
        for img_req in image_requests:
            # Attach the item so file_path can read its title later.
            img_req.item = item
        print("Title:", item['title'])
        return image_requests

    def file_path(self, request, response=None, info=None, *, item=None):
        old_path = super().file_path(request, response, info)
        print("Old Path:", old_path)
        title = request.item['title']
        save_path = os.path.join(settings.IMAGES_STORE, title[0].replace('|', ''))
        image_name = old_path.replace('full/', '')
        return os.path.join(save_path, image_name)
```
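After running the code, check the debug output to locate the cause of the problem. If none of this output appears, that part of the code was probably never executed, and you need to check for other possible problems, such as the spider's configuration (for example, whether the pipeline is enabled in `ITEM_PIPELINES`) or how the data is passed along.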
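The print statements show what the pipeline was asked to do, but not why a download failed. Scrapy's media pipelines pass a `results` list of `(success, value)` pairs to `item_completed`, and for a failed entry `value` carries the error. Below is a minimal sketch of logging those errors; `describe_failures` and `LogFailuresMixin` are hypothetical names introduced here, not part of Scrapy, and the sketch assumes the item exposes `title` through `.get()`.

```python
import logging

logger = logging.getLogger(__name__)


def describe_failures(results):
    """Return one message per failed entry of an item_completed `results` list."""
    messages = []
    for ok, value in results:
        if ok:
            continue
        # For failed entries Scrapy passes a Twisted Failure; fall back to
        # str() so the helper also works with plain exceptions.
        get_message = getattr(value, "getErrorMessage", None)
        messages.append(get_message() if callable(get_message) else str(value))
    return messages


class LogFailuresMixin:
    """Hypothetical mixin: list it before ImagesPipeline in the class bases."""

    def item_completed(self, results, item, info):
        # Log every failed download together with the item's title.
        for message in describe_failures(results):
            logger.error("Image not downloaded for title %r: %s",
                         item.get("title"), message)
        # Let the real pipeline finish its own bookkeeping.
        return super().item_completed(results, item, info)
```

To try it, the pipeline could be declared as `class PicSpiderPipeline(LogFailuresMixin, ImagesPipeline)`. With this in place, a path that the file system rejects should show up as a logged error instead of a silently missing image.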