Write Python code to scrape the Kugou TOP500 data
Time: 2023-10-06 14:07:52 Views: 119
Sure, here is example Python code that scrapes the Kugou TOP500 chart:
```python
import requests
from bs4 import BeautifulSoup

url = 'https://www.kugou.com/yy/rank/home/1-8888.html?from=rank'
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on an HTTP error status
soup = BeautifulSoup(response.text, 'html.parser')

# Each <li> in the chart list is one song.
song_list = soup.select('.pc_temp_songlist > ul > li')
for song in song_list:
    rank = song.select_one('.pc_temp_num').text.strip()
    name = song.select_one('.pc_temp_songname').text.strip()
    singer = song.select_one('.pc_temp_artist').text.strip()
    print(f'{rank}. {name} - {singer}')
```
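This code uses the requests library to send an HTTP request to the first page of the Kugou TOP500 chart, then parses the returned HTML with BeautifulSoup. It selects each song in the list with a CSS selector, extracts the rank, song title, and artist, and prints them. Only the songs on that one page are printed, not the full 500.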
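Covering all 500 entries means requesting more than one page. The following is a minimal sketch of how the page URLs might be generated. It rests on two assumptions that are not stated in the answer above and should be checked against the live site: that the number before `-8888` in the URL is a page index starting at 1, and that each page lists 22 songs. The names `PAGE_URL`, `SONGS_PER_PAGE`, and `build_page_urls` are hypothetical.

```python
import math

# Assumed URL pattern: the number before "-8888" is the page index (starting at 1).
PAGE_URL = 'https://www.kugou.com/yy/rank/home/{page}-8888.html?from=rank'
SONGS_PER_PAGE = 22  # assumption about the chart layout; verify against the live page


def build_page_urls(total_songs=500, per_page=SONGS_PER_PAGE):
    """Return the page URLs needed to cover `total_songs` chart entries."""
    page_count = math.ceil(total_songs / per_page)
    return [PAGE_URL.format(page=page) for page in range(1, page_count + 1)]


page_urls = build_page_urls()
print(len(page_urls))   # number of pages to request
print(page_urls[0])     # first page
print(page_urls[-1])    # last page
```

Each generated URL can then be fetched and parsed the same way as the single page in the example above.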
Related questions
Please write Python code to scrape the Douban movie TOP30 data
Sure, here is Python code that scrapes the Douban movie TOP30 data:
```python
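import requests
from bs4 import BeautifulSoup

url = 'https://movie.douban.com/top250'
# Douban tends to reject requests that lack a browser-like User-Agent.
headers = {'User-Agent': 'Mozilla/5.0'}
movies = []
# Each page lists 25 films, so the top 30 spans the first two pages.
for start in (0, 25):
    response = requests.get(url, params={'start': start}, headers=headers, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')
    movies.extend(soup.select('.info'))
for movie in movies[:30]:
    title = movie.select_one('.title').text.strip()
    rating = movie.select_one('.rating_num').text.strip()
    # Not every entry has a quote, so guard against a missing element.
    quote_tag = movie.select_one('.quote')
    quote = quote_tag.text.strip() if quote_tag else ''
    print(f'Title: {title}\nRating: {rating}\nQuote: {quote}\n')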
```
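Running the code prints the title, rating, and featured one-line quote for each film. Each page of the Douban list shows 25 films, so the top 30 spans the first two pages (`?start=0` and `?start=25`).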
Python code to scrape the Douban Top250 data
Here is example Python code that scrapes the Douban Top250 movie data:
```python
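import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

url = 'https://movie.douban.com/top250'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}


def get_movies(url):
    """Return the (name, rating) pairs on one page and the next page's URL, or None."""
    movie_list = []
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')
    movies = soup.find_all('div', class_='item')
    for movie in movies:
        movie_name = movie.find('span', class_='title').text
        movie_rating = movie.find('span', class_='rating_num').text
        movie_list.append((movie_name, movie_rating))
    # Guard against a missing "next" span as well as a span without a link.
    next_span = soup.find('span', class_='next')
    next_page = next_span.find('a') if next_span else None
    if next_page:
        # Resolve the href against the current URL instead of concatenating,
        # which would stack query strings from the second page onward.
        return movie_list, urljoin(url, next_page['href'])
    return movie_list, None


while url:
    movies, url = get_movies(url)
    for movie in movies:
        print(movie[0], movie[1])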
```
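This code starts at the Douban Top250 page, parses the page content with BeautifulSoup, and extracts each film's name and rating. It then finds the link to the next page and continues with that page, repeating until every page has been scraped.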
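Following the next-page link requires turning its `href` into a full URL. The sketch below illustrates one way to do that with `urllib.parse.urljoin` from the standard library, and contrasts it with plain string concatenation. It assumes the links are relative query strings such as `?start=25&filter=`; the `hrefs` values and the `follow_links` helper are hypothetical and only there for illustration.

```python
from urllib.parse import urljoin

BASE_URL = 'https://movie.douban.com/top250'

# Hypothetical href values, assumed to resemble relative next-page links.
hrefs = ['?start=25&filter=', '?start=50&filter=']


def follow_links(start_url, hrefs):
    """Resolve each relative href against the URL of the page it was found on."""
    urls = [start_url]
    for href in hrefs:
        urls.append(urljoin(urls[-1], href))
    return urls


page_urls = follow_links(BASE_URL, hrefs)
for page_url in page_urls:
    print(page_url)

# Plain concatenation stacks a second query string onto the first one.
concatenated = BASE_URL + hrefs[0] + hrefs[1]
print(concatenated)
```

With `urljoin`, a query-only reference replaces the query of the current URL, so each step yields a clean page address, whereas concatenation keeps appending to whatever URL came before.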
Note that when scraping web pages you should set a suitable User-Agent header to reduce the chance of being blocked by the site.