Write a crawler that scrapes JD.com product information
Time: 2024-03-28 08:06:59 Views: 70
Sure. Here is an example crawler that scrapes product information from JD.com search results:
```python
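import requests
from bs4 import BeautifulSoup


def get_jd_goods(keyword):
    """Search JD.com for `keyword` and return a list of product dicts."""
    url = 'https://search.jd.com/Search'
    # Pass the keyword via `params` so requests percent-encodes it
    params = {'keyword': keyword, 'enc': 'utf-8'}
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
    # A timeout keeps the script from hanging on a stalled connection
    resp = requests.get(url, params=params, headers=headers, timeout=10)
    resp.raise_for_status()  # fail loudly on HTTP errors
    soup = BeautifulSoup(resp.text, 'html.parser')
    goods_list = []
    for li in soup.select('#J_goodsList > ul > li'):
        try:
            # Image and detail links on the page are protocol-relative ("//...")
            img_url = 'https:' + li.select('.p-img img')[0]['src']
            name = li.select('.p-name em')[0].text.strip()
            price = li.select('.p-price strong')[0].text.strip()
            link = 'https:' + li.select('.p-name a')[0]['href']
            goods_list.append({'name': name, 'price': price, 'img_url': img_url, 'link': link})
        except (IndexError, KeyError):
            # Skip list items that lack one of the expected elements or attributes
            continue
    return goods_list


if __name__ == '__main__':
    goods_list = get_jd_goods('手机')
    for goods in goods_list:
        print(goods)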
```
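Running the code above prints the list of products that JD.com returns for the search keyword "手机" (mobile phone), including each product's name, price, image URL, and detail-page link. The crawler uses the requests library to send the HTTP request and the BeautifulSoup library to parse the HTML page and extract the required fields. A browser-like User-Agent header is also set so that the request looks like it comes from a browser, which reduces the chance of being rejected by basic anti-crawler checks.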
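
The search keyword may contain non-ASCII characters (such as "手机") or characters like `&` and spaces, so it is safer to percent-encode it than to paste it into the URL raw. The following is a minimal sketch of that step using only the standard library; `build_search_url` is a hypothetical helper name introduced for illustration and is not part of the original answer.

```python
from urllib.parse import urlencode

# Base search endpoint, taken from the crawler above
SEARCH_URL = 'https://search.jd.com/Search'


def build_search_url(keyword):
    """Return the search URL with the keyword percent-encoded as UTF-8.

    Hypothetical helper for illustration only.
    """
    # urlencode encodes non-ASCII text as UTF-8 bytes and escapes
    # reserved characters such as '&' and spaces
    query = urlencode({'keyword': keyword, 'enc': 'utf-8'})
    return f'{SEARCH_URL}?{query}'


if __name__ == '__main__':
    print(build_search_url('手机'))
    # https://search.jd.com/Search?keyword=%E6%89%8B%E6%9C%BA&enc=utf-8
```

The resulting string can be passed to `requests.get` directly. Passing a `params` dict to `requests.get` has the same effect, since requests encodes query parameters itself.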