Scraping the Lyricists and Composers of Jay Chou's Songs and Plotting Bar Charts
Posted: 2023-08-13 20:04:55
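Sure, I can help with this. First, we use Python to fetch the lyricist and composer information for Jay Chou's songs. I use the requests library here; the search endpoint returns JSON, so the response can be read directly and BeautifulSoup is not actually needed.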
```python
import html
import re

import pandas as pd
import requests

# Unofficial QQ Music search endpoint; its behaviour may change without notice.
url = 'https://c.y.qq.com/soso/fcgi-bin/client_search_cp'
headers = {
    'referer': 'https://y.qq.com/portal/search.html',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0'
}
params = {
    'ct': '24',
    'qqmusic_ver': '1298',
    'new_json': '1',
    'remoteplace': 'sizer.yqq.lyric_next',
    'searchid': '101914397014461827',
    't': '0',
    'aggr': '1',
    'cr': '1',
    'catZhida': '1',
    'lossless': '0',
    'flag_qc': '0',
    'p': '1',    # page number
    'n': '10',   # results per page
    'w': '周杰伦',  # search keyword
    'g_tk': '5381',
    'loginUin': '0',
    'hostUin': '0',
    'format': 'json',
    'inCharset': 'utf8',
    'outCharset': 'utf-8',
    'notice': '0',
    'platform': 'yqq.json',
    'needNewCode': '0'
}

res = requests.get(url, headers=headers, params=params, timeout=10)
res.raise_for_status()
json_response = res.json()
song_list = json_response['data']['song']['list']


def find_credit(lyric_text, label):
    """Return the name that follows a credit label ('词' or '曲'), or '' if none is found."""
    # Assumption: credits appear inside the lyric text as lines such as
    # "[00:01.00]词：方文山" or "作曲 : 周杰伦". Adjust the pattern if the format differs.
    pattern = rf'(?:^|\]|\s)作?{label}\s*[:：]\s*([^\r\n\[]+)'
    match = re.search(pattern, lyric_text, flags=re.MULTILINE)
    return match.group(1).strip() if match else ''


names = []
lyricists = []
composers = []
for song in song_list:
    mid = song['mid']
    song_url = f'https://c.y.qq.com/lyric/fcgi-bin/fcg_query_lyric_new.fcg?songmid={mid}&format=json&nobase64=1'
    song_res = requests.get(song_url, headers=headers, timeout=10)
    song_json = song_res.json()
    # The 'lyric' field holds the lyric text itself, not a lyricist name, so the
    # credits are parsed out of it. Punctuation may arrive as HTML entities.
    lyric_text = html.unescape(song_json.get('lyric', ''))
    names.append(song['name'])
    lyricists.append(find_credit(lyric_text, '词'))
    composers.append(find_credit(lyric_text, '曲'))

df = pd.DataFrame({
    'Song': names,
    'Lyricist': lyricists,
    'Composer': composers
})
```
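The request above asks only for the first page of 10 results (`p=1`, `n=10`), so the charts reflect a small sample. Below is a minimal sketch of paging through more results. `collect_songs` and `fetch_page` are hypothetical names that do not appear in the original code; `fetch_page(page)` is assumed to wrap the `requests.get` call with `params['p']` set to the page number and to return that page's song list. A fake fetcher stands in here so the sketch runs without network access.

```python
import time


def collect_songs(fetch_page, max_pages=5, delay=0.0):
    """Gather songs from successive result pages until a page comes back empty.

    fetch_page is a hypothetical callable: page number (1-based) -> list of song dicts.
    """
    songs = []
    seen = set()
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:
            break  # no more results
        for song in batch:
            # Skip songs already collected, identified by their 'mid'.
            if song['mid'] not in seen:
                seen.add(song['mid'])
                songs.append(song)
        time.sleep(delay)  # pause between requests to keep the request rate modest
    return songs


# Stand-in data for illustration only; a real fetch_page would call requests.get.
fake_pages = {
    1: [{'mid': 'a1', 'name': 'Song A'}, {'mid': 'b2', 'name': 'Song B'}],
    2: [{'mid': 'b2', 'name': 'Song B'}, {'mid': 'c3', 'name': 'Song C'}],
}
songs = collect_songs(lambda p: fake_pages.get(p, []))
print([s['name'] for s in songs])
```

With the real endpoint, `fetch_page` could set `params['p'] = str(page)` before each request, and a non-zero `delay` would space the requests out.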
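Next, we can use the matplotlib library to draw the bar charts. Here Counter tallies how many times each lyricist and composer appears, and the results are plotted as bar charts.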
```python
from collections import Counter

import matplotlib.pyplot as plt

# Assumption: a CJK-capable font such as SimHei is installed. Without one,
# Chinese names on the x-axis may render as empty boxes.
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False


def top_counts(values, n=10, min_count=2):
    """Count non-empty values, keep those seen at least min_count times, return the top n."""
    counts = Counter(v for v in values if v)
    return [(name, c) for name, c in counts.most_common() if c >= min_count][:n]


def plot_counts(pairs, role):
    """Draw one bar chart from a list of (name, count) pairs."""
    plt.figure(figsize=(10, 5))
    plt.bar(range(len(pairs)), [c for _, c in pairs], align='center')
    plt.xticks(range(len(pairs)), [name for name, _ in pairs], rotation=45)
    plt.title(f"Top 10 {role}s of Jay Chou's Songs")
    plt.xlabel(role)
    plt.ylabel('Number of songs')
    plt.tight_layout()
    plt.show()


plot_counts(top_counts(df['Lyricist']), 'Lyricist')
plot_counts(top_counts(df['Composer']), 'Composer')
```
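To make the counting step concrete without any network access, here is a small sketch on made-up values (the list is illustrative, not scraped data). It follows the same idea as the code above: tally each name, keep those credited on more than one song, and take the most frequent.

```python
from collections import Counter

# Illustrative values only, not scraped data. '' stands for a song with no credit found.
sample = ['Writer A', 'Writer B', 'Writer A', '', 'Writer C', 'Writer A', 'Writer B']

counts = Counter(name for name in sample if name)  # ignore empty credits
# Keep names credited on more than one song, most frequent first, at most 10.
top = [(name, n) for name, n in counts.most_common() if n > 1][:10]

for name, n in top:
    print(f'{name}: {n}')
# prints:
# Writer A: 3
# Writer B: 2
```

Note that with only 10 songs fetched, the "more than once" filter can leave very few bars, so it may be worth relaxing it for small samples.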
This produces bar charts of the lyricists and composers of Jay Chou's songs.