Python example, Douban Music code_A walkthrough of scraping the Douban Music chart with Python and Beautiful Soup
Date: 2024-05-17 09:12:31 Views: 163
The following walks through the code and the process for scraping the Douban Music chart with Python and Beautiful Soup:
1. Import the required libraries
```
import requests
from bs4 import BeautifulSoup
```
2. Send a request to the site with the requests library and fetch the page content
```
url = 'https://music.douban.com/chart'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
response = requests.get(url, headers=headers)
html = response.text
```
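The request above assumes the server answers successfully. A slightly more defensive variant might look like the sketch below. The helper name `fetch_html`, the 10-second timeout, the shortened `DEFAULT_HEADERS` placeholder, and the injectable `get` parameter are illustrative assumptions, not part of the original code.

```python
import requests

# Shortened placeholder; in practice reuse the full User-Agent string shown above.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0"}


def fetch_html(url, headers=None, timeout=10, get=requests.get):
    """Return the page body as text, raising on HTTP errors.

    `get` is injectable so the function can be exercised without network access.
    """
    response = get(url, headers=headers or DEFAULT_HEADERS, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx/5xx responses,
    # so an error page is not silently parsed as chart data.
    response.raise_for_status()
    return response.text
```

Calling `fetch_html('https://music.douban.com/chart')` would then stand in for the `requests.get(...)` and `response.text` lines, with a failed request surfacing as an exception instead of an empty result.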
3. Parse the page content with Beautiful Soup
```
soup = BeautifulSoup(html, 'html.parser')
```
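To see what the parsed `soup` object offers, here is a small offline sketch. The `sample_html` string is a made-up stand-in for the downloaded page, not real Douban markup.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, so the example runs offline.
sample_html = (
    "<html><head><title>Sample chart</title></head>"
    "<body><p class='pl'>Artist A</p></body></html>"
)

soup = BeautifulSoup(sample_html, "html.parser")

page_title = soup.title.get_text()   # text of the <title> element
first_paragraph = soup.find("p")     # first <p> tag, or None if there is none
# The class attribute is multi-valued, so Beautiful Soup returns it as a list.
print(page_title, first_paragraph["class"])
```

`'html.parser'` is the parser bundled with Python, so no extra install is needed; other parsers such as lxml can be swapped in if they are installed.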
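4. Locate the information to scrape
The Douban Music chart lists each song's rank, title, artist, and rating. We need to find the HTML tags and class names that hold this information.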
```
songs = soup.find_all('tr', class_='item')
```
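How `find_all` filters by tag name and class can be checked on a small fragment. The markup below is hypothetical, shaped only to match the selectors used above; the live page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup shaped like the selectors used above; the live page may differ.
sample_html = """
<table>
  <tr class="item"><td><span class="ranking_num">1</span></td></tr>
  <tr class="item"><td><span class="ranking_num">2</span></td></tr>
  <tr class="other"><td>not a chart row</td></tr>
</table>
"""

soup = BeautifulSoup(sample_html, "html.parser")
rows = soup.find_all("tr", class_="item")   # keyword form used in the text
rows_css = soup.select("tr.item")           # equivalent CSS-selector form
ranks = [
    row.find("span", class_="ranking_num").get_text(strip=True) for row in rows
]
print(len(rows), ranks)
```

The keyword is spelled `class_` because `class` is a reserved word in Python. Inspecting the real page in the browser's developer tools is the reliable way to confirm which tag and class names to pass.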
5. Extract the information and output it
```
for song in songs:
    # Each `song` is one <tr class="item"> row; read the text of each field.
    rank = song.find('span', class_='ranking_num').get_text().strip()
    title = song.find('a', class_='nbg').get_text().strip()
    artist = song.find('p', class_='pl').get_text().strip()
    rating = song.find('span', class_='rating_nums').get_text().strip()
    print(rank, title, artist, rating)
```
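`find` returns `None` when a row lacks the requested element, and calling `.get_text()` on `None` raises `AttributeError`. One way to guard against that is sketched below. The helper `text_or_default` and the sample row are hypothetical, introduced only for illustration.

```python
from bs4 import BeautifulSoup


def text_or_default(parent, name, class_name, default=""):
    """Return the stripped text of the first matching child, or `default`."""
    node = parent.find(name, class_=class_name)
    return node.get_text(strip=True) if node is not None else default


# Hypothetical row that lacks a rating, to show the fallback.
row = BeautifulSoup(
    '<tr class="item"><td><p class="pl">Artist A</p></td></tr>', "html.parser"
).find("tr")

artist = text_or_default(row, "p", "pl")
rating = text_or_default(row, "span", "rating_nums", default="N/A")
print(artist, rating)
```

Inside the loop, each `song.find(...).get_text().strip()` call could be replaced with `text_or_default(song, ...)` so that one incomplete row does not stop the whole run.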
The complete code:
```
import requests
from bs4 import BeautifulSoup
url = 'https://music.douban.com/chart'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
response = requests.get(url, headers=headers)
html = response.text
soup = BeautifulSoup(html, 'html.parser')
songs = soup.find_all('tr', class_='item')
for song in songs:
    # Each `song` is one <tr class="item"> row; read the text of each field.
    rank = song.find('span', class_='ranking_num').get_text().strip()
    title = song.find('a', class_='nbg').get_text().strip()
    artist = song.find('p', class_='pl').get_text().strip()
    rating = song.find('span', class_='rating_nums').get_text().strip()
    print(rank, title, artist, rating)
```
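This code is intended to scrape information for the top 100 songs on the Douban Music chart. The number of rows actually returned depends on the page's current markup, so the tag and class names may need adjusting.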
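The loop above only prints each row. To keep the results, the same fields could be written to CSV with the standard library, as in the sketch below. The function name `write_rows`, the column names, the output filename, and the sample row are assumptions for illustration; the sample row is placeholder data, not a real chart entry.

```python
import csv
import io

FIELDNAMES = ["rank", "title", "artist", "rating"]


def write_rows(rows, file_obj):
    """Write an iterable of dicts keyed by FIELDNAMES as CSV to an open text file."""
    writer = csv.DictWriter(file_obj, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Placeholder data, not real chart entries.
rows = [{"rank": "1", "title": "Song A", "artist": "Artist A", "rating": "9.0"}]

# An in-memory buffer keeps the example free of side effects.
buffer = io.StringIO()
write_rows(rows, buffer)
print(buffer.getvalue())

# Writing to disk instead: newline="" avoids blank lines on Windows, and
# utf-8-sig helps spreadsheet tools display Chinese titles correctly.
# with open("douban_chart.csv", "w", newline="", encoding="utf-8-sig") as f:
#     write_rows(rows, f)
```

In the scraping loop, each iteration would append a dict such as `{'rank': rank, 'title': title, 'artist': artist, 'rating': rating}` to a list, which is then passed to `write_rows` once the loop finishes.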