Scraping Douban Movies with Python
Posted: 2023-07-12 12:56:10 | Views: 206
Sure. Here is a basic framework for a Douban movie scraper that you can modify and extend to suit your needs. A simple example:
```python
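import requests
from bs4 import BeautifulSoup

url = 'https://movie.douban.com/top250'


def get_movies(url):
    # Send a browser-like User-Agent; the default one from requests may be rejected.
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()  # fail early on HTTP errors
    soup = BeautifulSoup(response.text, 'html.parser')
    # The movie entries are expected inside <ol class="grid_view">.
    movie_list_soup = soup.find('ol', attrs={'class': 'grid_view'})
    movie_name_list = []
    if movie_list_soup is None:  # page layout differs from what is expected
        return movie_name_list
    for movie_li in movie_list_soup.find_all('li'):
        detail = movie_li.find('div', attrs={'class': 'hd'})
        if detail is None:  # skip list items that are not movie entries
            continue
        # find() returns the first <span class="title"> in the entry.
        movie_name = detail.find('span', attrs={'class': 'title'}).getText()
        movie_name_list.append(movie_name)
    return movie_name_list


movies = get_movies(url)
for movie in movies:
    print(movie)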
```
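This code uses the requests and BeautifulSoup libraries to fetch movie titles from the Douban Top 250 page. As written, it requests a single URL, so it returns only the titles listed on that one page rather than the full list. You can modify and extend it to suit your needs. When scraping a site, be sure to follow the site's crawling rules and the applicable laws and regulations.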
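One possible extension is to cover more than one page while checking the site's rules first. The sketch below is only an illustration under stated assumptions: it assumes the list is paginated through a `start` offset query parameter with 25 entries per page, and the helper names `page_urls` and `allowed_urls` as well as the `SAMPLE_ROBOTS` rules are hypothetical (the sample rules are not Douban's real robots.txt). It uses only the standard library and makes no network requests.

```python
from urllib.robotparser import RobotFileParser

BASE_URL = 'https://movie.douban.com/top250'


def page_urls(base_url=BASE_URL, total=250, page_size=25):
    """Build one URL per result page.

    Assumption: the site paginates with a `start` offset parameter.
    """
    return [f'{base_url}?start={offset}' for offset in range(0, total, page_size)]


def allowed_urls(urls, robots_lines, user_agent='*'):
    """Keep only the URLs that the given robots.txt rules permit."""
    parser = RobotFileParser()
    # parse() takes the lines of a robots.txt file already held in memory.
    parser.parse(robots_lines)
    return [u for u in urls if parser.can_fetch(user_agent, u)]


# Hypothetical rules for illustration only; NOT the real robots.txt of any site.
SAMPLE_ROBOTS = [
    'User-agent: *',
    'Disallow: /private/',
]

urls = allowed_urls(page_urls(), SAMPLE_ROBOTS)
print(len(urls))
print(urls[1])
```

In real use, the rules would come from the site itself, for example by calling `set_url()` and `read()` on the `RobotFileParser` instead of `parse()`. Each permitted URL could then be passed to `get_movies`, ideally with a short `time.sleep()` between requests to keep the load on the site low.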