
Example: Scraping List Data with BeautifulSoup

Date: 2023-11-08 18:52:27

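Below is a simple example of scraping list data with BeautifulSoup. Suppose we want to extract a movie list from a web page, including each movie's title, director, and lead actors. The URL, tag names, and class names used here are placeholders for illustration. The steps are as follows:

1. Fetch the page content with the requests library:

```python
import requests

url = 'http://www.example.com/movies'
response = requests.get(url, timeout=10)  # avoid hanging on a slow server
response.raise_for_status()  # stop early on an HTTP error status
html_content = response.text
```

2. Parse the page content with BeautifulSoup:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup(html_content, 'html.parser')
```

3. Find the list element that contains the movie entries. Note that `find` returns `None` when nothing matches, in which case the next step would raise an `AttributeError`:

```python
movies_list = soup.find('ul', {'class': 'movies-list'})
```

4. Iterate over each movie entry in the list and extract the title, director, and lead actors:

```python
movies = []
for movie in movies_list.find_all('li'):
    # Each <li> is assumed to hold one movie
    name = movie.find('h3').text
    director = movie.find('div', {'class': 'director'}).text
    # Every <a> inside the actors <div> is treated as one actor
    actors = [actor.text for actor in movie.find('div', {'class': 'actors'}).find_all('a')]
    movies.append({'name': name, 'director': director, 'actors': actors})
```

The resulting movie data is stored in the `movies` variable. Each entry is a dictionary containing the movie's title, director, and list of lead actors.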
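The steps above can also be tried without a network request. The following is a minimal sketch, not a definitive implementation: `SAMPLE_HTML` is made-up markup that merely follows the structure assumed in this example (`ul.movies-list`, `h3`, `div.director`, `div.actors`), and `text_or_none` and `parse_movies` are hypothetical helper names introduced here. Unlike the step-by-step version, it tolerates missing elements instead of raising an `AttributeError`.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<ul class="movies-list">
  <li>
    <h3>Movie A</h3>
    <div class="director">Director A</div>
    <div class="actors"><a>Actor 1</a><a>Actor 2</a></div>
  </li>
  <li>
    <h3>Movie B</h3>
    <div class="actors"><a>Actor 3</a></div>
  </li>
</ul>
"""


def text_or_none(tag):
    """Return the stripped text of a tag, or None if the tag is missing."""
    return tag.get_text(strip=True) if tag is not None else None


def parse_movies(html):
    """Extract movie dictionaries from markup shaped like SAMPLE_HTML."""
    soup = BeautifulSoup(html, 'html.parser')
    movies_list = soup.find('ul', class_='movies-list')
    if movies_list is None:
        # No matching list on the page: return an empty result
        return []

    movies = []
    for item in movies_list.find_all('li'):
        actors_div = item.find('div', class_='actors')
        if actors_div is not None:
            actors = [a.get_text(strip=True) for a in actors_div.find_all('a')]
        else:
            actors = []
        movies.append({
            'name': text_or_none(item.find('h3')),
            'director': text_or_none(item.find('div', class_='director')),
            'actors': actors,
        })
    return movies


movies = parse_movies(SAMPLE_HTML)
for movie in movies:
    print(movie)
```

In this sketch a missing title or director is recorded as `None` and a missing actors block as an empty list, so one incomplete entry does not abort the whole scrape. Whether to skip such entries instead is a design choice that depends on the target page.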
