Scrape the weather page https://lishi.tianqi.com/shijiazhuang/index.html and save it to a local file named weather.txt

Posted: 2025-01-05 21:43:41

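Scraping a web page and saving it to a local file usually involves three steps: a network request, HTML parsing, and file I/O. Many sites deploy anti-scraping measures, so fetching the page directly may run into restrictions. Below is a basic Python scraper skeleton that fetches the page with the `requests` library and parses the HTML with `BeautifulSoup`. It assumes you have `lxml` installed, which is an efficient HTML parser.

```python
import requests
from bs4 import BeautifulSoup

# Request the page. The User-Agent value is an assumption: some sites reject
# clients that do not look like a browser.
url = "https://lishi.tianqi.com/shijiazhuang/index.html"
headers = {"User-Agent": "Mozilla/5.0"}
response = requests.get(url, headers=headers, timeout=10)

# Check whether the request succeeded
if response.status_code == 200:
    # Parse the HTML (requires lxml to be installed)
    soup = BeautifulSoup(response.text, 'lxml')

    # Extract the data you need; this depends on the actual structure of the
    # weather page. We assume the weather information sits in an element with
    # class 'temp'; adjust this to match the real HTML.
    element = soup.find('div', {'class': 'temp'})
    if element is None:
        # find() returns None when nothing matches, so guard before using it
        print("No element with class 'temp' found; check the page structure")
    else:
        # Write the data to a file
        with open("weather.txt", "w", encoding="utf-8") as f:
            f.write(element.text)
else:
    print(f"Request failed, status code: {response.status_code}")
```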
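Because the `temp` class is only a guess at the page structure, it can help to have a fallback that saves all visible text of the page when no specific element can be located. The following is a minimal sketch of that idea using only the standard library's `html.parser`, so it does not depend on `lxml`. The names `VisibleTextParser`, `html_to_text`, and `save_text` are hypothetical helpers introduced here for illustration, not part of any library.

```python
from html.parser import HTMLParser
from pathlib import Path


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []      # non-empty text fragments in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML string, one fragment per line."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def save_text(html: str, path: str = "weather.txt") -> int:
    """Write the visible text to `path` and return the number of characters."""
    text = html_to_text(html)
    Path(path).write_text(text, encoding="utf-8")
    return len(text)
```

With the page already fetched, calling `save_text(response.text)` would write the whole page's text to weather.txt. The output is noisier than a targeted selector, since it includes navigation and footer text, but it does not fail when the page layout differs from what was assumed.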
