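Write a complete program that uses regular expressions to fetch the link addresses and titles of the 20 most recent announcements in the "Notices and Announcements" (通知公告) section on the homepage of the Huzhou University (湖州师范学院) campus website, http://www.zjhu.edu.cn/, and saves the results to a text file.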


Posted: 2024-10-19 15:12:44

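To complete this task, we can use Python's `requests` library to fetch the page and `BeautifulSoup` to parse it, with the standard-library `re` module available for regular-expression matching. `requests` and `beautifulsoup4` are third-party packages; if they are not installed yet, install them with pip:

```bash
pip install requests beautifulsoup4
```

Below is a complete Python example that fetches the announcement links and titles from the Huzhou University homepage and saves them to a text file. Note that this version locates the entries with BeautifulSoup rather than with a regular expression, and that the class name `announcement` is an assumption about the page markup that must be adapted to the real structure:

```python
import requests
from bs4 import BeautifulSoup

# Target URL
url = "http://www.zjhu.edu.cn/"

# Fetch the page with requests (the timeout avoids hanging forever)
response = requests.get(url, timeout=10)

# Check whether the request succeeded
if response.status_code == 200:
    # Decode with the detected charset so Chinese titles are not garbled
    response.encoding = response.apparent_encoding

    # Parse the HTML content
    soup = BeautifulSoup(response.text, "html.parser")

    # Locate the announcement entries. This assumes each entry carries a
    # specific class such as "announcement"; adjust it to the real page.
    notices = soup.find_all(class_="announcement")

    # Collect (link, title) pairs
    results = []
    for notice in notices[:20]:  # first 20 entries
        # Find the title according to the actual page structure
        title = notice.find("h2") or notice.find("span", class_="title")
        title_text = title.text.strip() if title else "Untitled"

        # .get() avoids a KeyError when an <a> tag has no href attribute
        link = notice.find("a")
        link_url = link.get("href", "No link") if link else "No link"

        results.append((link_url, title_text))

    # Save the results to a text file
    with open("announcements.txt", "w", encoding="utf-8") as f:
        for link_url, title_text in results:
            f.write(f"Title: {title_text}\nLink: {link_url}\n\n")
else:
    print(f"Could not access the page, status code: {response.status_code}")
```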
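Because the question asks specifically for regular expressions, the same task can also be sketched with the `re` module doing the extraction. This is a minimal sketch under stated assumptions, not a verified scraper: it assumes the announcement list is the first `<ul>` that follows the text 通知公告 in the page source, which has not been checked against the live site. The helper names `extract_announcements`, `save_announcements`, and `main` are hypothetical and introduced here only for illustration.

```python
import re
from html import unescape
from urllib.parse import urljoin

BASE_URL = "http://www.zjhu.edu.cn/"

# ASSUMPTION: the announcement list is the first <ul> after the text "通知公告".
SECTION_RE = re.compile(r"通知公告.*?<ul[^>]*>(.*?)</ul>", re.S)
# Matches <a ... href="URL" ...>TITLE</a>; TITLE may contain nested tags.
LINK_RE = re.compile(
    r'<a\b[^>]*?\bhref=["\']([^"\']+)["\'][^>]*>(.*?)</a>', re.S | re.I
)
# Used to strip any tags nested inside the link text.
TAG_RE = re.compile(r"<[^>]+>")


def extract_announcements(html, base_url=BASE_URL, limit=20):
    """Return up to `limit` (absolute_url, title) pairs found by regex."""
    section = SECTION_RE.search(html)
    # If the assumed layout is not found, fall back to scanning the whole
    # page, which will also pick up unrelated links.
    fragment = section.group(1) if section else html
    results = []
    for href, inner in LINK_RE.findall(fragment):
        title = unescape(TAG_RE.sub("", inner)).strip()
        if not title:
            continue  # skip image-only or empty links
        # Resolve relative hrefs against the site root
        results.append((urljoin(base_url, href), title))
        if len(results) == limit:
            break
    return results


def save_announcements(items, path="announcements.txt"):
    """Write one title/link pair per entry to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as f:
        for url, title in items:
            f.write(f"Title: {title}\nLink: {url}\n\n")


def main():
    # Third-party dependency, imported here so the helpers above
    # depend only on the standard library.
    import requests

    response = requests.get(BASE_URL, timeout=10)
    response.raise_for_status()
    response.encoding = response.apparent_encoding
    items = extract_announcements(response.text)
    save_announcements(items)
    print(f"Saved {len(items)} announcements")
```

Calling `main()` fetches the live homepage and writes `announcements.txt`. Regular expressions are brittle against HTML changes, so `SECTION_RE` is the first thing to adjust after inspecting the actual page source.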
