How do I use Python to get the rating and the number of ratings for the Douban book 《Python编程快速上手》 from the page "https://book.douban.com/subject/24531956/"?
Time: 2023-05-21 08:02:05 Views: 64
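You can do this with Python's requests and BeautifulSoup libraries. First fetch the page content with requests, then parse it with BeautifulSoup, locate the tags that hold the rating and the number of ratings, and extract their text.
Here is sample code: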
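```python
import requests
from bs4 import BeautifulSoup

url = 'https://book.douban.com/subject/24531956/'
# Assumption: Douban tends to reject the default requests User-Agent, so send a browser-like one.
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')
# The score sits in <strong class="rating_num">; strip the surrounding whitespace.
rating = soup.find('strong', class_='rating_num').text.strip()
# Match on the class alone, since rating_people may be an <a> rather than a <span>.
num_ratings = soup.find(class_='rating_people').find('span').text.strip()
print('Rating:', rating)
print('Number of ratings:', num_ratings)
```
Output (the values reflect the page at the time of the answer and will change):
```
Rating: 8.9
Number of ratings: 17469
```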
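The selectors can be checked offline before pointing them at the live site. Below is a minimal sketch, assuming a hand-written HTML fragment (`SAMPLE_HTML`) that imitates the rating block. The helper name `parse_rating` is hypothetical, and the real page's markup may differ:

```python
from bs4 import BeautifulSoup

# Hand-written fragment imitating the rating block; not copied from the live page.
SAMPLE_HTML = """
<div class="rating_self">
  <strong class="ll rating_num"> 8.9 </strong>
  <div class="rating_sum">
    <a class="rating_people" href="comments"><span>17469</span>人评价</a>
  </div>
</div>
"""


def parse_rating(html):
    """Return (rating, num_ratings), using None for any value that is missing."""
    soup = BeautifulSoup(html, "html.parser")
    rating_tag = soup.find("strong", class_="rating_num")
    people_tag = soup.find(class_="rating_people")
    count_tag = people_tag.find("span") if people_tag else None
    rating_text = rating_tag.get_text(strip=True) if rating_tag else ""
    count_text = count_tag.get_text(strip=True) if count_tag else ""
    rating = float(rating_text) if rating_text else None
    num_ratings = int(count_text) if count_text else None
    return rating, num_ratings


print(parse_rating(SAMPLE_HTML))      # (8.9, 17469)
print(parse_rating("<html></html>"))  # (None, None)
```

Returning `None` instead of raising keeps a batch job running when a page lacks a score, which may happen for books with very few ratings or when the request was blocked.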
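Related questions
How do I use Python to extract the author and publisher from the page "https://book.douban.com/subject/24531956/", represent them as a dictionary, and save them to a JSON file?
You can use Python's requests and BeautifulSoup libraries to extract the information from the page, and the json library to save it to a JSON file. Here is sample code: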
```python
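import json
import requests
from bs4 import BeautifulSoup
url = "https://book.douban.com/subject/24531956/"
# Assumption: Douban tends to reject the default requests User-Agent, so send a browser-like one.
headers = {"User-Agent": "Mozilla/5.0"}
response = requests.get(url, headers=headers, timeout=10)
soup = BeautifulSoup(response.text, "html.parser")
info = soup.find("div", id="info")
# The first link in the info block is assumed to be the author; collapse embedded whitespace.
author = " ".join(info.find("a").get_text().split())
label = info.find("span", string="出版社:")
# The publisher is either bare text right after the label span or the next link.
sibling = label.next_sibling
publisher = sibling.strip() if isinstance(sibling, str) else ""
if not publisher:
    publisher = label.find_next_sibling("a").get_text(strip=True)
book_info = {"author": author, "publisher": publisher}
with open("book_info.json", "w", encoding="utf-8") as f:
    json.dump(book_info, f, ensure_ascii=False)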
```
This code extracts the author and publisher from the Douban book page and saves them to a JSON file named book_info.json.
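To confirm that the file holds what was intended, the saved JSON can be read back and compared with the original dictionary. This is a minimal sketch using only the standard library. The dictionary values are illustrative stand-ins for the scraped strings, and the file is written to a temporary directory rather than the working directory:

```python
import json
import os
import tempfile

# Illustrative values only; a real run would use the scraped strings.
book_info = {"author": "Al Sweigart", "publisher": "人民邮电出版社"}

path = os.path.join(tempfile.mkdtemp(), "book_info.json")
with open(path, "w", encoding="utf-8") as f:
    # ensure_ascii=False writes Chinese characters as-is instead of \uXXXX escapes.
    json.dump(book_info, f, ensure_ascii=False, indent=2)

with open(path, encoding="utf-8") as f:
    raw = f.read()
loaded = json.loads(raw)

print("人民邮电出版社" in raw)  # True: the text is stored unescaped
print(loaded == book_info)      # True: the round trip preserves the data
```

Passing `encoding="utf-8"` on both the write and the read matters because the platform default encoding is not always UTF-8.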
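How do I use Python and XPath to extract the author, publisher, publication year, page count, price, binding, series, and ISBN from the page "https://book.douban.com/subject/24531956/", represent them as a dictionary, and save them to a JSON file?
You can use Python's requests and lxml libraries. Here is sample code: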
```python
import json
import requests
from lxml import etree

url = 'https://book.douban.com/subject/24531956/'
# Assumption: Douban tends to reject the default requests User-Agent, so send a browser-like one.
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()
html = etree.HTML(response.text)
info = html.xpath('//div[@id="info"]')[0]


def field(label):
    """Return the value that follows a label span such as '出版社', or '' if absent."""
    # Assumption: each field is a <span class="pl"> label followed by its value.
    spans = info.xpath('.//span[@class="pl"][contains(text(), $label)]', label=label)
    if not spans:
        return ''
    # The value is either the bare text after the label or the first link after it.
    value = (spans[0].tail or '').strip().lstrip(':').strip()
    if not value:
        links = spans[0].xpath('following-sibling::a[1]/text()')
        value = links[0] if links else ''
    return ' '.join(value.split())


book_info = {
    'author': field('作者'),
    'publisher': field('出版社'),
    'pub_year': field('出版年'),
    'page_num': field('页数'),
    'price': field('定价'),
    'binding': field('装帧'),
    'series': field('丛书'),
    'isbn': field('ISBN'),
}
with open('book_info.json', 'w', encoding='utf-8') as f:
    json.dump(book_info, f, ensure_ascii=False)
```
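This code extracts the book's author, publisher, publication year, page count, price, binding, series, and ISBN from the given page and saves them to a JSON file named book_info.json.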
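When an info block is laid out as label spans followed by plain-text values, an XPath like `span[n]/text()` selects the label rather than the value, because the value is a text node that follows the span. The sketch below illustrates this on a hand-written fragment. `SAMPLE_HTML` and the helper `value_after` are hypothetical, the values are illustrative, and the live page's markup may differ:

```python
from lxml import etree

# Hand-written fragment imitating a label/value info block; not the live page.
SAMPLE_HTML = """
<div id="info">
  <span class="pl">出版社:</span> 人民邮电出版社<br/>
  <span class="pl">页数:</span> 391<br/>
  <span class="pl">ISBN:</span> 9787115422699<br/>
</div>
"""

info = etree.HTML(SAMPLE_HTML).xpath('//div[@id="info"]')[0]


def value_after(label):
    """Return the text that directly follows the label span, or '' if absent."""
    texts = info.xpath(
        './span[@class="pl"][normalize-space()=$label]/following-sibling::text()[1]',
        label=label,
    )
    return texts[0].strip() if texts else ""


# Positional selection returns the text inside the span, which is the label.
positional = info.xpath("./span[2]/text()")[0]
print(positional)            # 页数:
print(value_after("页数:"))   # 391
print(value_after("ISBN:"))  # 9787115422699
```

Selecting by label text rather than by position also keeps working if a field is missing or the fields appear in a different order.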