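Write a scraper for Maoyan's real-time box office that extracts the movie title, total box office, daily combined box office, daily split box office, box-office share, and number of screenings, and saves the data to Excel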
Date: 2024-06-09 12:05:07
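Sure. This requires Python and a few third-party libraries. The steps are as follows:
1. Install the requests, BeautifulSoup, and openpyxl libraries
Install them with the following commands: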
```
pip install requests
pip install beautifulsoup4
pip install openpyxl
```
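2. Send a request to fetch the data
After inspecting how the Maoyan site loads its data, we can use the requests library to send a GET request for the real-time box-office data and decode the JSON response. The code is as follows: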
```python
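import requests

# Maoyan box-office endpoint; the response body is decoded as JSON below
url = 'https://box.maoyan.com/promovie/api/box/second.json'
# Send a browser-like User-Agent with the request
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # fail early on HTTP errors
data = response.json()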
```
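Network requests can fail intermittently, so it may help to wrap the call in a small retry helper. The following is a minimal sketch, not part of the original answer: `fetch_json` is a hypothetical helper name, and it takes the getter (for example `requests.get`) as a parameter so the retry logic can be exercised without touching the network.

```python
import time


def fetch_json(get, url, headers=None, retries=3, backoff=1.0, timeout=10):
    """Call get(url, ...) and return the decoded JSON, retrying on failure.

    `get` is any callable with the same shape as requests.get. Between
    attempts the helper sleeps for backoff * attempt seconds.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            response = get(url, headers=headers, timeout=timeout)
            response.raise_for_status()  # turn HTTP 4xx/5xx into an exception
            return response.json()
        except Exception as exc:  # deliberately broad: this is only a sketch
            last_error = exc
            if attempt < retries:
                time.sleep(backoff * attempt)
    raise RuntimeError(f"failed to fetch {url} after {retries} attempts") from last_error
```

With requests installed, it could be used as `data = fetch_json(requests.get, url, headers=headers)`.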
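3. Parse the data
Use the BeautifulSoup library to parse the data and extract the movie title, total box office, daily combined box office, daily split box office, box-office share, and number of screenings. This approach assumes that data['data'] holds an HTML fragment as a string; if the endpoint returns structured JSON instead, the fields should be read from the decoded object directly. The code is as follows: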
```python
from bs4 import BeautifulSoup

# Assumes data['data'] is an HTML fragment (a string)
soup = BeautifulSoup(data['data'], 'html.parser')
movies = soup.find_all('div', class_='movie-box')
results = []
for movie in movies:
    name = movie.find('div', class_='movie-name').text.strip()
    # The first three "stonefont" spans hold the box-office figures
    figures = movie.find_all('span', class_='stonefont')
    total_box = figures[0].text
    today_box = figures[1].text
    split_box = figures[2].text
    # Box-office share: second <span> under the parent of the first figure
    box_rate = figures[0].parent.find_all('span')[1].text
    # Number of screenings: second <p> inside the movie block
    show_count = movie.find_all('p')[1].text.strip()
    results.append([name, total_box, today_box, split_box, box_rate, show_count])
```
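Since the request in step 2 decodes the response as JSON, the payload may already be structured data rather than HTML. As an alternative to the BeautifulSoup approach, here is a minimal sketch that reads the fields directly. The key names (`list`, `movieName`, `sumBoxInfo`, `boxInfo`, `splitBoxInfo`, `boxRate`, `showInfo`) are assumptions, not confirmed by the original answer, and `extract_rows` is a hypothetical helper; print the actual response and adjust the names before relying on it.

```python
# Assumed field names, in the same column order as the results list above.
# Verify them against the real response before use.
FIELDS = ["movieName", "sumBoxInfo", "boxInfo", "splitBoxInfo", "boxRate", "showInfo"]


def extract_rows(payload, fields=FIELDS):
    """Return one list of values per movie found in payload['data']['list'].

    Missing fields become empty strings; an unexpected payload shape
    yields an empty list instead of raising.
    """
    data = payload.get("data")
    if not isinstance(data, dict):
        return []
    movies = data.get("list") or []
    return [[movie.get(field, "") for field in fields] for movie in movies]
```

If the shape matches, `results = extract_rows(data)` yields rows that can be written to Excel the same way as the rows built with BeautifulSoup.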
4. Save the data to Excel
Use the openpyxl library to write the data to an Excel file. The code is as follows:
```python
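from openpyxl import Workbook

wb = Workbook()
ws = wb.active
# Header row
ws.append(['Movie', 'Total box office', 'Daily combined box office', 'Daily split box office', 'Box-office share', 'Screenings'])
# One row per movie
for result in results:
    ws.append(result)
wb.save('maoyan_box.xlsx')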
```
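The scraped values are strings, so Excel will treat them as text. If numeric cells are wanted, they can be converted before being appended. The sketch below is illustrative only: `to_number` is a hypothetical helper, and the input formats it handles (thousands separators, a trailing `%`, and the unit suffixes 万 and 亿) are assumptions about how the figures might be written.

```python
# Chinese numeric unit suffixes: 万 = ten thousand, 亿 = one hundred million
UNITS = {"万": 10_000, "亿": 100_000_000}


def to_number(text):
    """Convert strings such as '1,234', '35.2%', '1.5万' to float.

    Percentages become fractions (35.2% -> about 0.352). Anything that
    cannot be parsed is returned unchanged so no data is lost.
    """
    cleaned = text.strip().replace(",", "")
    multiplier = 1.0
    if cleaned.endswith("%"):
        cleaned = cleaned[:-1]
        multiplier = 0.01
    elif cleaned and cleaned[-1] in UNITS:
        multiplier = UNITS[cleaned[-1]]
        cleaned = cleaned[:-1]
    try:
        return float(cleaned) * multiplier
    except ValueError:
        return text
```

A row could then be written as `ws.append([result[0]] + [to_number(v) for v in result[1:]])`, which keeps the title as text and converts the remaining columns where possible.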
The complete code is as follows:
```python
import requests
from bs4 import BeautifulSoup
from openpyxl import Workbook

# Step 2: fetch the data
url = 'https://box.maoyan.com/promovie/api/box/second.json'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # fail early on HTTP errors
data = response.json()

# Step 3: parse the data (assumes data['data'] is an HTML fragment string)
soup = BeautifulSoup(data['data'], 'html.parser')
movies = soup.find_all('div', class_='movie-box')
results = []
for movie in movies:
    name = movie.find('div', class_='movie-name').text.strip()
    # The first three "stonefont" spans hold the box-office figures
    figures = movie.find_all('span', class_='stonefont')
    total_box = figures[0].text
    today_box = figures[1].text
    split_box = figures[2].text
    # Box-office share: second <span> under the parent of the first figure
    box_rate = figures[0].parent.find_all('span')[1].text
    # Number of screenings: second <p> inside the movie block
    show_count = movie.find_all('p')[1].text.strip()
    results.append([name, total_box, today_box, split_box, box_rate, show_count])

# Step 4: save the data to Excel
wb = Workbook()
ws = wb.active
ws.append(['Movie', 'Total box office', 'Daily combined box office', 'Daily split box office', 'Box-office share', 'Screenings'])
for result in results:
    ws.append(result)
wb.save('maoyan_box.xlsx')
```
After running the code, an Excel file named maoyan_box.xlsx containing the real-time box-office data is created in the current directory.
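Because the data is real-time and the file name is fixed, each run overwrites the previous file. If a history of snapshots is wanted, one option is to put a timestamp in the file name. This is a minimal sketch; `snapshot_path` is a hypothetical helper, not part of the original code.

```python
from datetime import datetime
from pathlib import Path


def snapshot_path(directory=".", prefix="maoyan_box", when=None):
    """Build a path like <directory>/maoyan_box_20240609_120507.xlsx.

    `when` defaults to the current local time.
    """
    when = when or datetime.now()
    return Path(directory) / f"{prefix}_{when:%Y%m%d_%H%M%S}.xlsx"
```

The workbook could then be saved with `wb.save(snapshot_path())` instead of a fixed name.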