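I want to use Python to scrape the title, year, quarter, rating, and number of ratings for some movies from a rating site and save them as an Excel spreadsheet. Can you write a program that does this?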
Posted: 2024-04-08 08:36:09 Views: 44
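Certainly! Here is an example program that uses Python's requests and BeautifulSoup libraries to scrape movie information from a rating site, then writes the results to an Excel spreadsheet with openpyxl: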
```python
import requests
from bs4 import BeautifulSoup
import openpyxl

# Send the HTTP request and fetch the page content
url = "https://example.com/movies"  # Replace with the actual rating site's URL
response = requests.get(url, timeout=10)
response.raise_for_status()  # Stop early if the server returned an error status
html_content = response.text

# Parse the page content with BeautifulSoup
soup = BeautifulSoup(html_content, "html.parser")

# Find each movie's information and collect it in a list.
# The tag and class names below are placeholders; adjust them to the real page.
movies = []
movie_elements = soup.find_all("div", class_="movie")
for movie_element in movie_elements:
    name = movie_element.find("h2").text.strip()
    year = movie_element.find("span", class_="year").text.strip()
    quarter = movie_element.find("span", class_="quarter").text.strip()
    rating = movie_element.find("span", class_="rating").text.strip()
    num_ratings = movie_element.find("span", class_="num-ratings").text.strip()
    movie_info = {
        "name": name,
        "year": year,
        "quarter": quarter,
        "rating": rating,
        "num_ratings": num_ratings,
    }
    movies.append(movie_info)

# Create the Excel workbook and write the data
wb = openpyxl.Workbook()
sheet = wb.active

# Write the header row
headers = ["Title", "Year", "Quarter", "Rating", "Number of ratings"]
for col_num, header in enumerate(headers, 1):
    sheet.cell(row=1, column=col_num, value=header)

# Write one row per movie, starting at row 2
for row_num, movie in enumerate(movies, 2):
    sheet.cell(row=row_num, column=1, value=movie["name"])
    sheet.cell(row=row_num, column=2, value=movie["year"])
    sheet.cell(row=row_num, column=3, value=movie["quarter"])
    sheet.cell(row=row_num, column=4, value=movie["rating"])
    sheet.cell(row=row_num, column=5, value=movie["num_ratings"])

# Save the Excel workbook
wb.save("movies.xlsx")
```
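Replace `url` in the code with the URL of the rating site you actually want to scrape, and adjust the tag and class names to match that site's HTML. Then run the code to save the movie information to an Excel file named "movies.xlsx".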
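The program above stores every field as text, so Excel will not sort or average the rating and rating-count columns numerically. Below is a minimal sketch of how the scraped strings could be converted to numbers before they are written. The helper names `parse_rating` and `parse_count` are hypothetical, and the input formats (a decimal such as "8.7", a count such as "12,345 ratings") are assumptions about what the site returns.

```python
import re
from typing import Optional


def parse_rating(text: str) -> Optional[float]:
    """Return the first decimal number found in text as a float, or None."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None


def parse_count(text: str) -> Optional[int]:
    """Return the first integer found in text, ignoring commas, or None."""
    match = re.search(r"\d[\d,]*", text)
    return int(match.group().replace(",", "")) if match else None
```

With these helpers, the loop could store `parse_rating(rating)` and `parse_count(num_ratings)` instead of the raw strings. A field with no digits becomes `None`, which openpyxl writes as an empty cell.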