Write a crawler in PyCharm for http://data.cityhouse.cn/rank/index.html?

Time: 2023-12-23 14:02:40 Views: 70

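Sure. Here is a simple Python crawler that fetches http://data.cityhouse.cn/rank/index.html? and saves the table data to a local file:

```python
import csv

import requests
from bs4 import BeautifulSoup

# Send an HTTP GET request and stop early on network or HTTP errors
url = 'http://data.cityhouse.cn/rank/index.html?'
response = requests.get(url, timeout=10)
response.raise_for_status()

# Parse the HTML with BeautifulSoup
soup = BeautifulSoup(response.text, 'html.parser')

# Extract the table rows
rows = soup.find_all('tr')

# Write the rows to a local CSV file; csv.writer quotes cells that contain commas
with open('data.csv', 'w', encoding='utf-8', newline='') as f:
    writer = csv.writer(f)
    for row in rows:
        cols = [col.get_text(strip=True) for col in row.find_all(['td', 'th'])]
        if cols:  # skip rows that contain no cells
            writer.writerow(cols)
```

Save the code as `crawler.py` and run `python crawler.py` from the command line, or run the file directly in PyCharm. Both `requests` and `beautifulsoup4` must be installed first (`pip install requests beautifulsoup4`). When the script finishes, a file named `data.csv` appears in the current directory, containing the text of every table row found in the page's HTML. Content that the page loads later through JavaScript is not part of that HTML and will not be captured. If the Chinese text comes out garbled, setting `response.encoding = response.apparent_encoding` before reading `response.text` usually fixes it.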
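The row-extraction step can be checked offline, before any request is sent to the site. The following is a minimal sketch that swaps BeautifulSoup for the standard library's `html.parser`, so it runs without extra packages. The names `TableRowParser`, `table_to_csv`, and `SAMPLE_HTML` are hypothetical, and the sample table is invented for illustration; the real page's markup may differ.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of <td>/<th> cells, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows that contain no cells
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


# Invented sample markup, NOT taken from the real page
SAMPLE_HTML = """
<table>
  <tr><th>Rank</th><th>City</th><th>Price</th></tr>
  <tr><td>1</td><td>City A</td><td>65,000</td></tr>
  <tr><td>2</td><td><a href="#">City B</a></td><td>60,000</td></tr>
</table>
"""

csv_text = table_to_csv(SAMPLE_HTML)
print(csv_text)
```

The sample includes a price with a comma on purpose: joining cells with `','.join(...)` would split that value into two columns, whereas `csv.writer` wraps it in quotes so the row keeps three columns.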
