Python hotel scraping code
Date: 2023-12-07 09:39:35
Below is a simple Python example that scrapes hotel information from TripAdvisor. It uses the requests library to fetch the page and BeautifulSoup to parse it:
```python
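import requests
from bs4 import BeautifulSoup

# Set a request header so the request looks like it comes from a browser
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
# URL of the page to scrape
url = 'https://www.tripadvisor.cn/Hotels-g60763-New_York_City_New_York-Hotels.html'
# Send the request and fetch the page content
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early if the server returns an HTTP error
content = response.content
# Parse the page content; the class names below depend on the site's markup
soup = BeautifulSoup(content, 'html.parser')
hotel_list = soup.find_all('div', {'class': 'ui_column is-8 main_col allowEllipsis'})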
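# Loop over the hotel list and extract each hotel's details
for hotel in hotel_list:
    # Hotel name (fall back to a placeholder if the element is missing)
    name_tag = hotel.find('a', {'class': 'property_title prominent'})
    name = name_tag.text.strip() if name_tag is not None else 'No name information'
    # Hotel address (same fallback approach)
    address_tag = hotel.find('div', {'class': 'prw_rup prw_common_atf_header_bl headerBL'})
    address = address_tag.text.strip() if address_tag is not None else 'No address information'
    # Hotel price; find() returns None when an element is missing,
    # so the chained call raises AttributeError in that case
    try:
        price = hotel.find('div', {'class': 'price-wrap'}).find('div', {'class': 'price'}).text.strip()
    except AttributeError:
        price = 'No price information'
    # Print the hotel details
    print('Hotel name:', name)
    print('Hotel address:', address)
    print('Hotel price:', price)
    print('------------------------')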
```
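The extraction step can be tried offline, without sending a request, by running the same kind of lookups against a small HTML string. The sketch below is only an illustration. `SAMPLE_HTML` is made-up markup that imitates the class names used in the code above, and `text_or_default` and `parse_hotels` are hypothetical helper names, not part of requests or BeautifulSoup. The real page's structure may differ, so the class names would likely need adjusting.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup imitating the class names used above;
# it is not copied from the real site.
SAMPLE_HTML = """
<div class="ui_column is-8 main_col allowEllipsis">
  <a class="property_title prominent"> Hotel A </a>
  <div class="price-wrap"><div class="price">$120</div></div>
</div>
<div class="ui_column is-8 main_col allowEllipsis">
  <a class="property_title prominent">Hotel B</a>
</div>
"""


def text_or_default(node, tag, css_class, default):
    """Return the stripped text of the first matching descendant, or default."""
    found = node.find(tag, {'class': css_class}) if node is not None else None
    return found.text.strip() if found is not None else default


def parse_hotels(html):
    """Extract the name and price of each hotel card in the given HTML."""
    soup = BeautifulSoup(html, 'html.parser')
    cards = soup.find_all('div', {'class': 'ui_column is-8 main_col allowEllipsis'})
    hotels = []
    for card in cards:
        # price_wrap is None when the card has no price block
        price_wrap = card.find('div', {'class': 'price-wrap'})
        hotels.append({
            'name': text_or_default(card, 'a', 'property_title prominent', 'No name information'),
            'price': text_or_default(price_wrap, 'div', 'price', 'No price information'),
        })
    return hotels


hotels = parse_hotels(SAMPLE_HTML)
for h in hotels:
    print(h['name'], '|', h['price'])
```

Keeping the parsing in a function that takes an HTML string makes it possible to check the selectors against a saved copy of the page before running them on live responses.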