Python Code for Visual Analysis of Dangdang Books
Date: 2023-07-22 07:48:03
Below is a simple Python example that scrapes book listings from Dangdang and visualizes them:
```python
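import requests
from bs4 import BeautifulSoup
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Fetch the first page of search results for "python"
url = "http://search.dangdang.com/?key=python&act=input&page_index=1"
headers = {"User-Agent": "Mozilla/5.0"}  # browser-like header; some sites reject the default one
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.content, "html.parser")
# NOTE: the tag and class names used below are the ones this example assumes;
# check them against the live page, since the site's HTML may differ or change.
books = soup.find_all("li", class_="bigimg")

# Clean the data: extract one record per book
book_list = []
for book in books:
    try:
        title = book.find("a", class_="pic").get("title")
        info_links = book.find("div", class_="publisher_info").find_all("a")
        author = info_links[0].text.strip()
        publisher = info_links[1].text.strip()
        price = float(book.find("span", class_="price_n").text.strip().strip("¥"))
        # "sales" is parsed from text ending in "人收藏" (number of users who favorited the book)
        sales = int(book.find("span", class_="search_sales").text.strip().strip("人收藏"))
    except (AttributeError, IndexError, ValueError):
        # Skip entries with a missing field or an unexpected format
        continue
    book_list.append(
        {"title": title, "author": author, "publisher": publisher, "price": price, "sales": sales}
    )
df = pd.DataFrame(book_list, columns=["title", "author", "publisher", "price", "sales"])
if df.empty:
    raise SystemExit("No books were parsed; check the selectors above.")

# Total the counts per publisher and keep the 10 largest
top_publishers = df.groupby("publisher", as_index=False)["sales"].sum().nlargest(10, "sales")

# Visualization
# Assumes a CJK-capable font such as SimHei is installed, so Chinese publisher names render
plt.rcParams["font.sans-serif"] = ["SimHei", "DejaVu Sans"]
plt.figure(figsize=(12, 6))
sns.barplot(x="publisher", y="sales", data=top_publishers)
plt.title("Top 10 Publishers of Python Books by Sales")
plt.xlabel("Publisher")
plt.ylabel("Sales")
plt.xticks(rotation=45, ha="right")
plt.tight_layout()
plt.show()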
```
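The code above scrapes data on Python books from Dangdang, uses pandas to clean and organize it, and uses seaborn and matplotlib to visualize it. The result is a bar chart of sales by publisher, titled as the top 10 publishers of Python books by sales volume. Note that the "sales" value is parsed from text ending in "人收藏" (the number of users who favorited the book), so it is best read as a popularity proxy rather than an exact sales figure.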
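The cleaning and ranking steps can be tried without any network access. The following is a minimal sketch on made-up rows: the `parse_number` helper, the publisher names, and all values are hypothetical and only imitate the kind of text the scraper extracts. It shows one way to pull numbers out of strings such as "¥59.00" or "1200人收藏" and then total the counts per publisher before keeping the 10 largest.

```python
import re

import pandas as pd


def parse_number(text):
    """Return the first number found in text as a float, or NaN if there is none."""
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else float("nan")


# Hypothetical raw rows, shaped like the text a scraper might extract.
raw_rows = [
    {"publisher": "Publisher A", "price": "¥59.00", "sales": "1200人收藏"},
    {"publisher": "Publisher B", "price": "¥89.50", "sales": "300人收藏"},
    {"publisher": "Publisher A", "price": "¥45.00", "sales": "800人收藏"},
    {"publisher": "Publisher C", "price": "¥72.00", "sales": ""},  # missing count
]

df = pd.DataFrame(raw_rows)
df["price"] = df["price"].map(parse_number)
df["sales"] = df["sales"].map(parse_number)

# Drop rows without a usable count, then total the counts per publisher
# and keep at most the 10 largest totals, sorted in descending order.
top_publishers = (
    df.dropna(subset=["sales"])
    .groupby("publisher")["sales"]
    .sum()
    .nlargest(10)
)
print(top_publishers)
```

Using a regular expression instead of `str.strip` means extra whitespace or a different currency symbol does not break the conversion, and rows with no number become NaN so they can be dropped explicitly instead of raising an error.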