
Python hotel scraping code

Date: 2023-12-07 09:39:35 Views: 36

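Below is a simple example of scraping hotel information from TripAdvisor (猫途鹰) with Python. It uses the requests library to fetch the page and BeautifulSoup to parse it:

```python
import requests
from bs4 import BeautifulSoup

# Set the request headers to mimic a browser
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}

# URL of the page to scrape
url = 'https://www.tripadvisor.cn/Hotels-g60763-New_York_City_New_York-Hotels.html'

# Send the request and fetch the page content
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on an HTTP error status
content = response.content

# Parse the page content
soup = BeautifulSoup(content, 'html.parser')
hotel_list = soup.find_all('div', {'class': 'ui_column is-8 main_col allowEllipsis'})

# Iterate over the hotel list and extract each hotel's details
for hotel in hotel_list:
    # Hotel name (find() returns None when the element is missing)
    try:
        name = hotel.find('a', {'class': 'property_title prominent'}).text.strip()
    except AttributeError:
        name = 'No name available'
    # Hotel address
    try:
        address = hotel.find('div', {'class': 'prw_rup prw_common_atf_header_bl headerBL'}).text.strip()
    except AttributeError:
        address = 'No address available'
    # Hotel price
    try:
        price = hotel.find('div', {'class': 'price-wrap'}).find('div', {'class': 'price'}).text.strip()
    except AttributeError:
        price = 'No price available'
    # Print the hotel details
    print('Hotel name:', name)
    print('Hotel address:', address)
    print('Hotel price:', price)
    print('------------------------')
```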
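The parsing step can be tried offline before pointing it at the live site. The sketch below is one possible way to do that, not the only one: it moves the extraction logic into a function and runs it on a small inline HTML string. `SAMPLE_HTML`, `text_or_default`, and `parse_hotels` are hypothetical names introduced here for illustration, and the sample markup is an assumption that simply reuses the class names from the example above; the real page structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption, not copied from the real site).
# It reuses the class names from the example above.
SAMPLE_HTML = """
<div class="ui_column is-8 main_col allowEllipsis">
  <a class="property_title prominent"> Hotel A </a>
  <div class="prw_rup prw_common_atf_header_bl headerBL">1 Example St</div>
  <div class="price-wrap"><div class="price">$120</div></div>
</div>
<div class="ui_column is-8 main_col allowEllipsis">
  <a class="property_title prominent">Hotel B</a>
</div>
"""


def text_or_default(parent, tag, class_name, default):
    """Return the stripped text of the first matching tag, or default if absent."""
    if parent is None:
        return default
    node = parent.find(tag, {'class': class_name})
    return node.get_text(strip=True) if node is not None else default


def parse_hotels(html):
    """Parse hotel blocks out of an HTML string into a list of dicts."""
    soup = BeautifulSoup(html, 'html.parser')
    hotels = []
    for hotel in soup.find_all('div', {'class': 'ui_column is-8 main_col allowEllipsis'}):
        # The price sits inside a wrapper div, which may itself be missing
        price_wrap = hotel.find('div', {'class': 'price-wrap'})
        hotels.append({
            'name': text_or_default(hotel, 'a', 'property_title prominent',
                                    'No name available'),
            'address': text_or_default(hotel, 'div',
                                       'prw_rup prw_common_atf_header_bl headerBL',
                                       'No address available'),
            'price': text_or_default(price_wrap, 'div', 'price',
                                     'No price available'),
        })
    return hotels


if __name__ == '__main__':
    for hotel in parse_hotels(SAMPLE_HTML):
        print(hotel)
```

Returning a list of dicts instead of printing inside the loop keeps the parsing separate from the output, so the same function can be fed `response.content` from a real request once the selectors have been checked against the current page.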