debug: crawled (200) <get http
Date: 2023-05-02 08:00:56 Views: 398
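This is a log line printed by a crawler. It means the program fetched a URL successfully and the server answered with status code 200, i.e. the page responded normally. "debug" is the log level, so this is a debugging message; "(200)" is the HTTP status code. In "<get http", "get" is the HTTP request method and "http" is the start of the requested URL; the rest of the URL is cut off in the question, so the actual address is unknown.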
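To make the parts of such a line concrete, here is a minimal sketch that splits a complete "Crawled" line into status code, method, URL and referer using only the standard library. The regular expression and the helper name `parse_crawled_line` are illustrative assumptions, not part of Scrapy; the sample line is a full line of the kind Scrapy's engine logs.

```python
import re

# Assumed pattern for the "Crawled" lines logged by scrapy.core.engine at DEBUG level.
CRAWLED_RE = re.compile(
    r"DEBUG: Crawled \((?P<status>\d{3})\) "   # HTTP status code in parentheses
    r"<(?P<method>[A-Z]+) (?P<url>\S+)> "      # request method and URL inside <...>
    r"\(referer\s*:\s*(?P<referer>[^)]*)\)"    # referer, "None" when there is none
)

def parse_crawled_line(line):
    """Return the fields of a 'Crawled' log line, or None if the line does not match."""
    m = CRAWLED_RE.search(line)
    if m is None:
        return None
    return {
        "status": int(m["status"]),
        "method": m["method"],
        "url": m["url"],
        "referer": m["referer"].strip(),
    }

sample = ("2019-08-30 11:58:45 [scrapy.core.engine] DEBUG: "
          "Crawled (200) <GET https://www.example.com> (referer: None)")
parsed = parse_crawled_line(sample)
print(parsed)
```

Lines that are not "Crawled" messages, such as "Spider opened", simply return `None`.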
Related questions
['MOOCSpider.pipelines.TextPipeline', 'MOOCSpider.pipelines.MongoPipeline'] 2023-05-16 22:37:59 [scrapy.core.engine] INFO: Spider opened 2023-05-16 22:37:59 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min) 2023-05-16 22:37:59 [course] INFO: Spider opened: course 2023-05-16 22:37:59 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023 2023-05-16 22:37:59 [scrapy.core.engine] DEBUG: Crawled (200) <POST https://www.icourse163.org/web/j/mocSearchBean.searchCourse.rpc?csrfKey=e3e6d578b16f42e7afe506ddae0ddd7b> (referer : None)
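This is log output from the Scrapy crawling framework. The list at the start shows the enabled item pipelines, a text pipeline (TextPipeline) and a MongoDB pipeline (MongoPipeline). The following lines show that the spider named course was opened, that nothing had been crawled yet (0 pages, 0 items), and that a Telnet console is listening on 127.0.0.1:6023. Scrapy then sent a POST request to www.icourse163.org and received a 200 response; "referer: None" means the request carried no referer. Judging by the endpoint name (mocSearchBean.searchCourse.rpc), the request is probably a course search.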
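The pipeline list in that log normally comes from the `ITEM_PIPELINES` setting in the project's `settings.py`. A minimal sketch of what such a setting might look like is shown below; the priority values 300 and 400 are assumptions, since the log does not show them. Scrapy runs pipelines in ascending order of these values.

```python
# settings.py fragment (sketch). The class paths are taken from the log;
# the priority numbers are assumed for illustration only.
ITEM_PIPELINES = {
    "MOOCSpider.pipelines.TextPipeline": 300,   # assumed priority
    "MOOCSpider.pipelines.MongoPipeline": 400,  # assumed priority
}

# Pipelines with lower values process each item first.
run_order = sorted(ITEM_PIPELINES, key=ITEM_PIPELINES.get)
print(run_order)
```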
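How to check the run time in Scrapy
By default, Scrapy dumps its stats to the console when the spider closes, and these include the run time. For example: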
```
2019-08-30 11:58:45 [scrapy.core.engine] INFO: Spider opened
2019-08-30 11:58:45 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2019-08-30 11:58:45 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2019-08-30 11:58:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.example.com> (referer: None)
2019-08-30 11:58:46 [scrapy.core.engine] INFO: Closing spider (finished)
2019-08-30 11:58:46 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 226,
'downloader/request_count': 1,
'downloader/request_method_count/GET': 1,
'downloader/response_bytes': 45429,
'downloader/response_count': 1,
'downloader/response_status_count/200': 1,
'elapsed_time_seconds': 0.673051,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2019, 8, 30, 3, 58, 46, 265718),
'log_count/DEBUG': 1,
'log_count/INFO': 10,
'response_received_count': 1,
'scheduler/dequeued': 1,
'scheduler/dequeued/memory': 1,
'scheduler/enqueued': 1,
'scheduler/enqueued/memory': 1,
'start_time': datetime.datetime(2019, 8, 30, 3, 58, 45, 592667)}
```
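Here `elapsed_time_seconds` is the run time in seconds. It equals `finish_time` minus `start_time`; both are recorded in UTC, which is why they read 03:58 while the log lines, printed in local time, read 11:58.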
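As a quick check, the elapsed time can be recomputed from the two timestamps in the stats dump above. This is a small standard-library sketch that copies the values by hand; it does not read them from Scrapy.

```python
from datetime import datetime

# Values copied from the 'start_time' and 'finish_time' entries in the stats dump.
start_time = datetime(2019, 8, 30, 3, 58, 45, 592667)
finish_time = datetime(2019, 8, 30, 3, 58, 46, 265718)

# Subtracting two datetimes gives a timedelta; total_seconds() converts it to a float.
elapsed = (finish_time - start_time).total_seconds()
print(elapsed)  # 0.673051
```

The result matches the `elapsed_time_seconds` value reported in the dump.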