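Write code to scrape JD.com book marketing data and analyze it visually
Time: 2023-12-24 10:05:03 Views: 77
As an AI language model, I cannot run code directly, so the code below is offered only as a reference; adapt it to your actual situation as needed:
Scraping JD.com book marketing data: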
```
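import requests
from bs4 import BeautifulSoup
import pandas as pd

# Set the request headers so the request looks like it comes from a browser
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}

# Fetch the JD.com book ranking page
url = 'https://book.jd.com/booktop/0-0-0.html?category=1713-0-0-0-10001-1'
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on an HTTP error
soup = BeautifulSoup(response.text, 'html.parser')

# Parse the data. The class names below are assumptions about the page markup
# and may need adjusting; content rendered by JavaScript is not in response.text.
book_list = soup.find_all('div', class_='p-detail')
data = []
for book in book_list:
    name_tag = book.select_one('div.p-name a')
    # `book` is already the p-detail div, so look for the author link inside it
    # directly instead of searching for a nested p-detail div (which returns None)
    author_tag = book.select_one('span a')
    price_tag = book.select_one('div.p-price strong i')
    if not (name_tag and author_tag and price_tag):
        continue  # skip entries that are missing any field
    title = name_tag.get('title') or name_tag.get_text(strip=True)
    data.append([title, author_tag.get_text(strip=True), price_tag.get_text(strip=True)])

# Convert to a DataFrame and save the data
# Column names: 书名 = title, 作者 = author, 价格 = price
df = pd.DataFrame(data, columns=['书名', '作者', '价格'])
df.to_csv('book_data.csv', index=False)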
```
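The price is scraped as text, so it has to become a number before any price analysis. A minimal sketch of that cleaning step follows, assuming the scraped strings may carry a currency symbol, surrounding whitespace, or thousands separators. `parse_price` is a hypothetical helper, not part of the original answer, and the sample strings are made up for illustration.

```python
import re
from typing import Optional

# Hypothetical helper, not part of the original answer.
# Matches the first integer or decimal number in a string.
_PRICE_RE = re.compile(r"\d+(?:\.\d+)?")


def parse_price(text: Optional[str]) -> Optional[float]:
    """Return the first number found in a scraped price string, or None."""
    if not text:
        return None
    cleaned = text.replace(",", "")  # drop thousands separators
    match = _PRICE_RE.search(cleaned)
    return float(match.group()) if match else None


# Made-up sample inputs covering the assumed cases
samples = ["¥45.80", " 1,299.00 ", "N/A", None]
prices = [parse_price(s) for s in samples]
print(prices)  # [45.8, 1299.0, None, None]
```

Returning `None` for unparseable values keeps bad rows visible, so they can be dropped or inspected later rather than silently turned into zeros.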
Visual analysis:
```
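import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load the data (columns: 书名 = title, 作者 = author, 价格 = price)
df = pd.read_csv('book_data.csv')
# Prices may have been saved as text; convert them and mark unparseable values as NaN
df['价格'] = pd.to_numeric(df['价格'], errors='coerce')
# If the author names are Chinese, matplotlib needs a CJK-capable font, e.g.
# plt.rcParams['font.sans-serif'] = ['SimHei']  (assumes that font is installed)

# Count how often each author appears and plot the top 10 as a bar chart
# (value_counts already sorts in descending order)
author_counts = df['作者'].value_counts().head(10)
plt.figure(figsize=(10, 6))
# A single color avoids the warning newer seaborn versions raise for palette without hue
sns.barplot(x=author_counts.index, y=author_counts.values, color='steelblue')
plt.xticks(rotation=45)
plt.title('Top 10 Authors')
plt.xlabel('Author')
plt.ylabel('Count')
plt.tight_layout()
plt.show()

# Plot the price distribution as a histogram
plt.figure(figsize=(10, 6))
sns.histplot(df['价格'].dropna(), bins=20, kde=True, color='skyblue')
plt.title('Price Distribution')
plt.xlabel('Price')
plt.ylabel('Count')
plt.show()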
```
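Before trusting the charts, it can help to check the numbers behind them in plain text, for example in an environment with no display. The sketch below recomputes the author counts and a price distribution with the standard library only. `top_authors` and `price_bins` are hypothetical helpers and the sample data is made up. Note that `price_bins` uses fixed-width bins, which is an assumption chosen for readability and differs from the 20 equal-width bins that `sns.histplot` derives from the data range.

```python
from collections import Counter


def top_authors(authors, n=10):
    """Count non-empty author names and return the n most frequent."""
    cleaned = [a.strip() for a in authors if a and a.strip()]
    return Counter(cleaned).most_common(n)


def price_bins(prices, bin_width=20.0):
    """Group prices into fixed-width bins keyed by each bin's lower edge."""
    counts = Counter()
    for p in prices:
        if p is None:
            continue  # skip missing prices
        lower = int(p // bin_width) * bin_width
        counts[lower] += 1
    return dict(sorted(counts.items()))


# Made-up sample data for illustration
authors = ["A", "B", "A", " ", None, "C", "A", "B"]
prices = [12.5, 39.9, 45.0, None, 59.0, 41.2]

print(top_authors(authors, n=2))  # [('A', 3), ('B', 2)]
print(price_bins(prices))         # {0.0: 1, 20.0: 1, 40.0: 3}
```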