Write me complete Scrapy spider code for scraping Maoyan movies
Posted: 2024-10-10 18:07:24 Views: 45
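Writing a Scrapy spider that scrapes Maoyan movie information takes a few steps. First, make sure Scrapy is installed (for example with `pip install scrapy`); its downloader, middlewares, and selector-based parsing ship with the framework, so this example needs nothing else. Here is a basic example that scrapes the data from a movie list page: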
```python
# Import the required module
import scrapy


class MaoyanMovieSpider(scrapy.Spider):
    name = 'maoyan_movie'
    allowed_domains = ['maoyan.com']
    start_urls = ['https://maoyan.com/board/4']  # a Maoyan board (ranking) page, not the site home page

    def parse(self, response):
        # Parse the movie list. The selectors below are placeholders:
        # check them against the live page's HTML before relying on them.
        movie_list = response.css('div.movie-item')
        for movie in movie_list:
            title = movie.css('a.title::text').get()  # movie title
            rating = movie.css('.rating_num::text').get()  # rating
            link = movie.css('a.title::attr(href)').get()  # link to the detail page
            yield {
                'title': title,
                'rating': rating,
                # Resolve relative URLs; keep None when no link was found
                'link': response.urljoin(link) if link else None,
            }

        # If there is a next page, keep crawling
        next_page = response.css('a.next::attr(href)').get()
        if next_page is not None:
            yield response.follow(next_page, self.parse)  # follow the link to the next page
```
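Because `.get()` returns `None` when a selector matches nothing, and extracted text often carries stray whitespace, the yielded items may need normalizing before they are stored. The following is a minimal sketch of one way to do that, assuming the three field names used by the spider above. The helpers `clean_text`, `parse_rating`, and `clean_item` are hypothetical names introduced for illustration; they are not part of Scrapy.

```python
from typing import Optional


def clean_text(value: Optional[str]) -> Optional[str]:
    """Strip surrounding whitespace; map None or empty strings to None."""
    if value is None:
        return None
    value = value.strip()
    return value or None


def parse_rating(value: Optional[str]) -> Optional[float]:
    """Convert a rating string such as ' 9.5 ' to a float, or None if it is not numeric."""
    text = clean_text(value)
    if text is None:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def clean_item(item: dict) -> dict:
    """Return a copy of a scraped item with normalized fields (hypothetical helper)."""
    return {
        'title': clean_text(item.get('title')),
        'rating': parse_rating(item.get('rating')),
        'link': clean_text(item.get('link')),
    }


if __name__ == '__main__':
    # Made-up sample data, for illustration only
    raw = {'title': '  Example Movie \n', 'rating': ' 9.5 ', 'link': None}
    print(clean_item(raw))
```

One option is to call `clean_item` on each dictionary in `parse` before yielding it; another is to call it from an item pipeline's `process_item` method so the spider itself stays focused on extraction.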