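Python crawler for Maoyan's real-time movie box office: extract the movie title, total box office, daily combined box office, daily split box office, box office share, and number of screenings, and save the data to Excel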
Posted: 2024-06-09 13:05:07
Sure. This task can be done with the Python libraries `requests` and `beautifulsoup4`, with `openpyxl` used to write the data to Excel. The steps are as follows:
1. Import the required libraries
```python
import requests
from bs4 import BeautifulSoup
import openpyxl
from openpyxl import Workbook
```
2. Define the request headers and URL, then send the request to fetch the page content
```python
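headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
url = 'https://box.maoyan.com/promovie/api/box/second.json'
# A timeout keeps the script from hanging if the server does not respond
response = requests.get(url, headers=headers, timeout=10)
# Raise an exception on 4xx/5xx responses instead of parsing an error page
response.raise_for_status()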
```
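Because the URL above ends in `.json`, the response may be a JSON document rather than an HTML page. The following is a minimal sketch of how rows could be pulled from such a payload. The sample payload, its nesting (`data` / `list`), and every key name in `FIELD_MAP` are hypothetical placeholders, not the real Maoyan schema; inspect the actual response before relying on any of them.

```python
import json

# Hypothetical payload for illustration only: the real nesting and key
# names must be read off the actual response.
SAMPLE_JSON = """
{"data": {"list": [
  {"name": "Film A", "total": "1200.5", "daily": "345.6",
   "split": "310.2", "share": "45.1%", "shows": "12345"}
]}}
"""

# Output column -> assumed JSON key (all keys are placeholders)
FIELD_MAP = {
    "Movie title": "name",
    "Total box office": "total",
    "Daily combined box office": "daily",
    "Daily split box office": "split",
    "Box office share": "share",
    "Screenings": "shows",
}


def rows_from_json(text):
    """Turn a JSON string into a list of rows, one per movie."""
    payload = json.loads(text)
    items = payload.get("data", {}).get("list", [])
    # Missing keys become empty strings instead of raising KeyError
    return [[str(item.get(key, "")) for key in FIELD_MAP.values()]
            for item in items]


rows = rows_from_json(SAMPLE_JSON)
print(rows)
```

With a live response, `response.json()` returns the decoded payload directly, and `response.headers.get('Content-Type')` is one way to check whether the server sent JSON or HTML.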
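3. Parse the HTML and extract the required fields. The CSS selectors below assume the response is an HTML page containing a `.movie-list` element; if the server returns JSON instead, `movies` will be an empty list and the loop will not run.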
```python
soup = BeautifulSoup(response.content, 'html.parser')
movies = soup.select('.movie-list > li')
for movie in movies:
    # Movie title
    name = movie.select_one('.movie-name').get_text().strip()
    # Total box office
    total_box_office = movie.select_one('.box-desc > .total-box-desc > .box-num').get_text().strip()
    # Daily combined box office
    daily_box_office = movie.select_one('.box-desc > .box-desc-item:nth-child(2) > .box-num').get_text().strip()
    # Daily split box office
    split_box_office = movie.select_one('.box-desc > .box-desc-item:nth-child(3) > .box-num').get_text().strip()
    # Box office share
    box_office_ratio = movie.select_one('.box-desc > .box-desc-item:nth-child(4) > .box-num').get_text().strip()
    # Number of screenings
    show_count = movie.select_one('.movie-detail > .detail-item:nth-child(3) > .detail-num').get_text().strip()
```
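`select_one` returns `None` when a selector matches nothing, so chaining `.get_text()` onto it raises `AttributeError` as soon as one field is missing. A small helper can guard against that. The sketch below runs against made-up HTML that only mirrors the class names used in the selectors above; `text_or_default` and the sample markup are illustrative assumptions, and the live page may be structured differently.

```python
from bs4 import BeautifulSoup

# Made-up markup that reuses the class names from the selectors above.
# The third .box-desc child is deliberately missing.
SAMPLE_HTML = """
<ul class="movie-list">
  <li>
    <span class="movie-name"> Film A </span>
    <div class="box-desc">
      <div class="total-box-desc"><span class="box-num">1200.5</span></div>
      <div class="box-desc-item"><span class="box-num">345.6</span></div>
    </div>
  </li>
</ul>
"""


def text_or_default(node, selector, default=""):
    """Return the stripped text of the first match, or `default` if none."""
    found = node.select_one(selector)
    return found.get_text().strip() if found is not None else default


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
records = []
for movie in soup.select(".movie-list > li"):
    records.append({
        "name": text_or_default(movie, ".movie-name"),
        "total": text_or_default(movie, ".box-desc > .total-box-desc > .box-num"),
        "daily": text_or_default(movie, ".box-desc > .box-desc-item:nth-child(2) > .box-num"),
        # No third child in the sample, so this falls back to ""
        "split": text_or_default(movie, ".box-desc > .box-desc-item:nth-child(3) > .box-num"),
    })

print(records)
```

Returning an empty string keeps the row shape stable, so one missing field does not abort the whole crawl.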
4. Save the data to Excel
```python
# Create the workbook object
wb = Workbook()
# Get the currently active worksheet
ws = wb.active
# Write the header row
ws.append(['Movie title', 'Total box office', 'Daily combined box office', 'Daily split box office', 'Box office share', 'Screenings'])
for movie in movies:
    # Extract the fields
    name = movie.select_one('.movie-name').get_text().strip()
    total_box_office = movie.select_one('.box-desc > .total-box-desc > .box-num').get_text().strip()
    daily_box_office = movie.select_one('.box-desc > .box-desc-item:nth-child(2) > .box-num').get_text().strip()
    split_box_office = movie.select_one('.box-desc > .box-desc-item:nth-child(3) > .box-num').get_text().strip()
    box_office_ratio = movie.select_one('.box-desc > .box-desc-item:nth-child(4) > .box-num').get_text().strip()
    show_count = movie.select_one('.movie-detail > .detail-item:nth-child(3) > .detail-num').get_text().strip()
    # Append one row per movie
    ws.append([name, total_box_office, daily_box_office, split_box_office, box_office_ratio, show_count])
# Save the workbook once, after all rows have been written
wb.save('box_office.xlsx')
```
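One way to check the Excel step on its own is to separate it from the scraping and round-trip a few rows through `openpyxl`. The sketch below writes to an in-memory buffer instead of `box_office.xlsx` and reads the sheet back; `build_workbook`, `HEADER`, and the sample row are illustrative names and data, not part of the original code.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook

HEADER = ["Movie title", "Total box office", "Daily combined box office",
          "Daily split box office", "Box office share", "Screenings"]


def build_workbook(rows):
    """Create a workbook with a header row followed by one row per movie."""
    wb = Workbook()
    ws = wb.active
    ws.append(HEADER)
    for row in rows:
        ws.append(row)
    return wb


# Sample data standing in for scraped values (kept as strings, as scraped)
sample_rows = [["Film A", "1200.5", "345.6", "310.2", "45.1%", "12345"]]

# Save to memory rather than disk so the check leaves no file behind
buffer = BytesIO()
build_workbook(sample_rows).save(buffer)
buffer.seek(0)

# Read the sheet back and collect every row as a plain list
loaded = load_workbook(buffer).active
written = [list(row) for row in loaded.iter_rows(values_only=True)]
print(written)
```

Because the scraped values are strings, Excel stores them as text; convert them to `float` or `int` before appending if the columns need to be sorted or summed numerically.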
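The complete script is the snippets above run in order; the loop in step 4 repeats the extraction from step 3, so the step 3 loop can be omitted.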