Scrape the first page of second-hand housing listings for Shijiazhuang on 58.com (https://sjz.58.com/ershoufang/p1/), extract seven fields per listing (title, floor plan, floor area, community name, community location, total price, and price per square metre), and save them to a CSV file.
Posted: 2024-12-11 08:44:31 · Views: 10
Scraping a website and saving the results to a CSV file requires some programming. Below is a basic Python example that uses the requests and BeautifulSoup libraries to fetch the 58.com listing page and extract the required fields. Note that real scraping may be affected by the site's anti-crawler measures, and you may need extra tooling such as Selenium to handle JavaScript-rendered content.
```python
import requests
from bs4 import BeautifulSoup
import csv
# Simulate a browser request with a desktop User-Agent
url = "https://sjz.58.com/ershoufang/p1/"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
}
response = requests.get(url, headers=headers)
response.encoding = response.apparent_encoding  # guard against mojibake if the charset header is missing
# Parse the HTML content
soup = BeautifulSoup(response.text, 'lxml')
# Extract the fields. The class names below are illustrative; 58.com changes its
# markup frequently, so inspect the live page and adjust the selectors as needed.
titles = [t.get_text(strip=True) for t in soup.find_all('h3', class_='title')]
house_types = [ht.get_text(strip=True) for ht in soup.find_all('span', class_='unit')]
square_meters = [s.get_text(strip=True).replace('㎡', '').replace('m²', '')
                 for s in soup.find_all('span', class_='area')]
neighborhoods = [n.get_text(strip=True) for n in soup.find_all('div', class_='community-name')]
locations = [loc.get_text(strip=True) for loc in soup.find_all('p', class_='location')]
total_prices = [p.get_text(strip=True) for p in soup.find_all('i', class_='total-price')]
# The price per square metre has its own element on the page; deriving it from the
# total-price string would give the wrong value.
prices_per_square_meter = [p.get_text(strip=True) for p in soup.find_all('p', class_='unit-price')]
# Write the rows to a CSV file (zip stops at the shortest list, so check that the field counts match)
data = zip(titles, house_types, square_meters, neighborhoods, locations, total_prices, prices_per_square_meter)
with open('58同城石家庄二手房信息.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerow(['标题', '户型', '平米数', '小区名称', '小区地点', '总金额', '每平米金额'])
    writer.writerows(data)
```
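If the plain requests call gets blocked or the listing data turns out to be rendered by JavaScript, the Selenium route mentioned above applies. The following is a minimal sketch, assuming Selenium 4 with a local Chrome installation (Selenium Manager resolves the driver automatically); the rendered HTML can then be parsed with the same BeautifulSoup and CSV code as above.

```python
# Fallback sketch: fetch the fully rendered page through a real browser.
# Assumes the selenium package is installed and Chrome is available locally.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup

options = Options()
options.add_argument('--headless=new')   # run Chrome without a visible window
driver = webdriver.Chrome(options=options)
try:
    driver.get("https://sjz.58.com/ershoufang/p1/")
    html = driver.page_source            # HTML after JavaScript has executed
finally:
    driver.quit()

soup = BeautifulSoup(html, 'lxml')       # reuse the extraction logic shown above
```

Once `soup` is built from the rendered HTML, the field extraction and CSV writing shown earlier work unchanged.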