Scraping the Douban Top 250 movies with Python
Date: 2023-07-06 13:26:41 Views: 48
The following Python code scrapes the Douban Top 250 movies and writes them to a CSV file:
```python
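import csv

import requests
from bs4 import BeautifulSoup

url = 'https://movie.douban.com/top250'
# Douban typically rejects requests that lack a browser-like User-Agent.
headers = {'User-Agent': 'Mozilla/5.0'}


def get_movies(url):
    """Fetch one page of the list and return its movies as dicts."""
    res = requests.get(url, headers=headers, timeout=10)
    res.raise_for_status()
    soup = BeautifulSoup(res.text, 'html.parser')
    movie_list = soup.find('ol', {'class': 'grid_view'})
    movies = []
    for movie_li in movie_list.find_all('li'):
        movie = {}
        movie_hd = movie_li.find('div', {'class': 'hd'})
        movie['title'] = movie_hd.find('span', {'class': 'title'}).text
        movie_info = movie_li.find('div', {'class': 'bd'})
        movie['rating'] = movie_info.find('span', {'class': 'rating_num'}).text
        # Some entries have no one-line quote, so guard against None.
        quote = movie_info.find('span', {'class': 'inq'})
        movie['quote'] = quote.text if quote else ''
        movies.append(movie)
    return movies


movies = []
# Each page holds 25 movies; start is the offset of the first one on the page.
for i in range(0, 250, 25):
    movies.extend(get_movies(url + '?start=' + str(i)))

# utf-8-sig keeps the Chinese titles readable when the file is opened in Excel.
with open('douban_top250_movies.csv', 'w', newline='', encoding='utf-8-sig') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['Title', 'Rating', 'Quote'])
    for movie in movies:
        writer.writerow([movie['title'], movie['rating'], movie['quote']])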
```
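This code scrapes each movie's title, rating, and one-line quote from the Douban Top 250 pages (10 pages of 25 movies each) and writes them to a CSV file named `douban_top250_movies.csv`.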
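The paging and CSV steps can be tried on their own, without touching the network. The sketch below is only an illustration: the helper names `page_urls` and `rows_to_csv` and the sample rows are hypothetical and not part of the code above, and it writes to an in-memory buffer rather than a file. It uses `csv.DictWriter` with `restval=''`, so a movie with no quote produces an empty cell instead of an error.

```python
import csv
import io
from urllib.parse import urlencode

BASE_URL = 'https://movie.douban.com/top250'


def page_urls(total=250, page_size=25):
    """Return one URL per list page, using the ?start= offset parameter."""
    return [f"{BASE_URL}?{urlencode({'start': start})}"
            for start in range(0, total, page_size)]


def rows_to_csv(movies, fileobj):
    """Write movie dicts to fileobj; missing keys become empty cells."""
    fields = ['title', 'rating', 'quote']
    writer = csv.DictWriter(fileobj, fieldnames=fields, restval='')
    writer.writeheader()
    writer.writerows(movies)


urls = page_urls()
print(len(urls), urls[0], urls[-1])

# Hypothetical sample rows standing in for scraped data.
sample = [
    {'title': 'Movie A', 'rating': '9.7', 'quote': 'A one-line quote.'},
    {'title': 'Movie B', 'rating': '9.0'},  # no quote on this entry
]
buffer = io.StringIO(newline='')
rows_to_csv(sample, buffer)
print(buffer.getvalue())
```

To write a real file instead, the same `rows_to_csv` call should work with a file opened as `open(path, 'w', newline='', encoding='utf-8-sig')`.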