Write a Python script to scrape movie information
Posted: 2024-05-18 07:18:16 Views: 165
Sure. Here is a simple Python scraper that pulls movie information from IMDb's Top 250 chart:
```python
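import re

import requests
from bs4 import BeautifulSoup

url = 'https://www.imdb.com/chart/top?ref_=nv_mv_250'
# Assumption: a browser-like User-Agent is sent because the default one may be rejected.
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on 4xx/5xx responses
soup = BeautifulSoup(response.text, 'html.parser')

# These selectors assume the table-based chart markup (td.titleColumn, td.posterColumn);
# if the page layout differs, they return empty lists and nothing is printed.
movies = soup.select('td.titleColumn')
# The link's title attribute lists the director and lead cast.
crew = [a.attrs.get('title') for a in soup.select('td.titleColumn a')]
ratings = [b.attrs.get('data-value') for b in soup.select('td.posterColumn span[name="ir"]')]

for cell, people, rating in zip(movies, crew, ratings):
    # Collapse whitespace so the text reads like "1. The Shawshank Redemption (1994)".
    movie = ' '.join(cell.get_text().split())
    match = re.match(r'(\d+)\.\s+(.+?)\s+\((\d{4})\)$', movie)
    if match is None:
        continue  # skip rows that do not follow the expected pattern
    place, movie_title, year = match.groups()
    print(f'{place} - {movie_title} ({year}) - Crew: {people} - Rating: {rating}')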
```
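This code uses the requests and BeautifulSoup libraries. It first requests IMDb's Top 250 chart page, then parses the returned HTML with BeautifulSoup. Next, it selects each movie's title cell, the cast and crew listed in the link's title attribute, and the rating, and finally prints one line per movie.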
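The step that splits a row's text into rank, title, and year can be checked on its own, without touching the network. Below is a minimal sketch, assuming each row's text looks like `1. The Shawshank Redemption (1994)` once whitespace is collapsed. `parse_entry` and `ENTRY_RE` are hypothetical names introduced here for illustration, not part of the original answer.

```python
import re

# Assumed row format: rank, a dot, the title, then a four-digit year in parentheses.
ENTRY_RE = re.compile(r'^(\d+)\.\s+(.+?)\s+\((\d{4})\)$')


def parse_entry(text):
    """Return (rank, title, year) for one chart row, or None if it does not match."""
    # Collapse newlines and repeated spaces, as the scraper does with get_text().
    normalized = ' '.join(text.split())
    match = ENTRY_RE.match(normalized)
    if match is None:
        return None
    rank, title, year = match.groups()
    return int(rank), title, int(year)


if __name__ == '__main__':
    # Sample strings are illustrative inputs, not scraped data.
    print(parse_entry('1.\n  The Shawshank Redemption\n (1994)'))
    print(parse_entry('10. E.T. the Extra-Terrestrial (1982)'))
```

Matching with a regular expression instead of fixed slice offsets keeps dots inside a title intact and does not depend on how many digits the rank has.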