Scraping JD.com's 2023 mobile phone sales ranking with Python
Date: 2023-07-12 09:05:31
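Sure, here is an approach along with some code examples.
First, we need to fetch the page for the 2023 sales ranking under JD's mobile phone category. This can be done with Python's requests and BeautifulSoup libraries: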
```python
import requests
from bs4 import BeautifulSoup

# JD phone-category listing page
url = 'https://list.jd.com/list.html?cat=9987,653,655&ev=exbrand_12803&sort=sort_rank_asc&trans=1&JL=6_0_0#J_crumbsBar'
# Send a browser-like User-Agent instead of the requests default
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}
# The timeout keeps the script from hanging on a stalled connection
response = requests.get(url, headers=headers, timeout=10)
soup = BeautifulSoup(response.text, 'html.parser')
```
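The request above does not check the HTTP status, so a blocked or failed request would hand an error page to the parser without any warning. A minimal sketch of a more defensive fetch follows. The helper name `fetch_html`, the `DEFAULT_HEADERS` constant, and the 10-second timeout are assumptions made for illustration and are not part of the original answer.

```python
import requests

# Same browser-like User-Agent idea as above; the exact string is not special
DEFAULT_HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}

def fetch_html(url, session=None, timeout=10):
    """Hypothetical helper: return the page body as text, or raise on HTTP errors."""
    # Use a caller-supplied session if given; otherwise call requests.get directly
    client = session if session is not None else requests
    response = client.get(url, headers=DEFAULT_HEADERS, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx instead of parsing an error page
    response.raise_for_status()
    return response.text
```

Passing a `requests.Session()` as `session` would reuse one connection across several page fetches; with no session, the helper behaves like a plain `requests.get` call.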
Next, we can use the methods BeautifulSoup provides to parse the page and find the element that contains the sales ranking.
```python
# Locate the ranking container; find() returns None if it is absent
rank_list = soup.find('div', {'class': 'rank-list'})
items = rank_list.find_all('li') if rank_list else []
```
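The lookup can be tried offline before running it against the live site. The sketch below parses a small hand-written HTML snippet. The markup in `SAMPLE_HTML` is hypothetical and only mirrors the class names used in this answer, so the real JD page may be structured differently. The helper name `find_rank_items` is also an assumption.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mirrors the class names used above; the live page may differ
SAMPLE_HTML = """
<div class="rank-list">
  <ul>
    <li><div class="p-name">Phone A</div><div class="p-commit">100条评价</div></li>
    <li><div class="p-name">Phone B</div><div class="p-commit">50条评价</div></li>
  </ul>
</div>
"""

def find_rank_items(html):
    """Return the <li> elements inside div.rank-list, or [] if the container is missing."""
    soup = BeautifulSoup(html, 'html.parser')
    rank_list = soup.find('div', {'class': 'rank-list'})
    if rank_list is None:
        # find() returns None when nothing matches, so guard before calling find_all()
        return []
    return rank_list.find_all('li')

print(len(find_rank_items(SAMPLE_HTML)))            # 2
print(find_rank_items('<div class="other"></div>'))  # []
```

Returning an empty list when the container is missing lets the later loop run zero times instead of failing with an `AttributeError` on `None`.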
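Finally, we can loop over the products, extract each product's name and its review count (the `p-commit` element holds the number of reviews, which this example uses as a stand-in for sales), and save them to a list.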
```python
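result = []
for item in items:
    # The product title is in the 'p-name' div
    name = item.find('div', {'class': 'p-name'}).text.strip()
    # 'p-commit' holds the review count; strip the "条评价" ("reviews") suffix
    sales = item.find('div', {'class': 'p-commit'}).text.strip().replace('条评价', '')
    result.append({'name': name, 'sales': sales})
print(result)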
```
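The `sales` values collected above are strings, so they cannot be sorted numerically as they are. The sketch below converts them to integers first. It assumes the counts may appear as plain digits, with a trailing `+`, or with the unit 万 (ten thousand). That format is an assumption about the page, and `parse_count` is a hypothetical helper name.

```python
import re

def parse_count(text):
    """Hypothetical helper: convert '1200', '500+' or '2万+' to an int; None if unparseable."""
    match = re.search(r'(\d+(?:\.\d+)?)\s*(万?)', text)
    if match is None:
        return None
    number = float(match.group(1))
    # 万 means ten thousand
    if match.group(2) == '万':
        number *= 10_000
    return int(round(number))

# Sample rows in the same shape as the `result` list built above
rows = [{'name': 'Phone A', 'sales': '500+'}, {'name': 'Phone B', 'sales': '2万+'}]
# Sort by the parsed count, largest first; unparseable values count as 0
ranked = sorted(rows, key=lambda r: parse_count(r['sales']) or 0, reverse=True)
print([r['name'] for r in ranked])  # ['Phone B', 'Phone A']
```

A count with a trailing `+` is a lower bound, so the resulting order is only approximate.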
The complete code is as follows:
```python
import requests
from bs4 import BeautifulSoup

url = 'https://list.jd.com/list.html?cat=9987,653,655&ev=exbrand_12803&sort=sort_rank_asc&trans=1&JL=6_0_0#J_crumbsBar'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}

# Fetch and parse the listing page
response = requests.get(url, headers=headers, timeout=10)
soup = BeautifulSoup(response.text, 'html.parser')

# Locate the ranking container; find() returns None if it is absent
rank_list = soup.find('div', {'class': 'rank-list'})
items = rank_list.find_all('li') if rank_list else []

# Collect each product's name and review count
result = []
for item in items:
    name = item.find('div', {'class': 'p-name'}).text.strip()
    # Strip the "条评价" ("reviews") suffix from the count
    sales = item.find('div', {'class': 'p-commit'}).text.strip().replace('条评价', '')
    result.append({'name': name, 'sales': sales})
print(result)
```
I hope this example helps you scrape JD's sales ranking.