
Scraping School News with a Python Crawler

Date: 2023-12-29 15:25:51 Views: 129

The following example uses urlopen and BeautifulSoup to scrape school news titles:

```python
from urllib.request import urlopen
from bs4 import BeautifulSoup

# Page to scrape
url = "https://www.hist.edu.cn/index/sy/kyyw.htm"

# Send the HTTP request and read the page content
response = urlopen(url)
html = response.read()

# Parse the page content with BeautifulSoup
soup = BeautifulSoup(html, "html.parser")

# Collect the link and title of each entry in the news list
news_list = soup.find_all("a", class_="news_title")
news_info = [(news["href"], news.get_text()) for news in news_list]

# Save the scraped information to the file "xinwen.txt"
with open("xinwen.txt", "w", encoding="utf-8") as f:
    for link, title in news_info:
        f.write(f"{link},{title}\n")

# Print the scraped information
for link, title in news_info:
    print(f"Link: {link}, Title: {title}")
```

Related questions:

1. How do you send an HTTP request with the urllib library?
2. What is the purpose of the BeautifulSoup library?
3. How do you save the scraped information to a file?
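On the question of saving the scraped information, two details may be worth handling. The `href` values on a listing page are often relative, and a title that contains a comma would make the `link,title` line format ambiguous. The sketch below is one possible way to address both using only the standard library. The helper names `normalize_news` and `save_news_csv`, the output file `xinwen.csv`, and the sample row are hypothetical and do not come from the original answer.

```python
import csv
from urllib.parse import urljoin


def normalize_news(news_info, base_url):
    """Resolve each link against base_url and strip whitespace around titles."""
    return [(urljoin(base_url, link), title.strip()) for link, title in news_info]


def save_news_csv(news_info, path):
    """Write (link, title) pairs as CSV so commas inside titles are quoted."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["link", "title"])
        writer.writerows(news_info)


# Hypothetical sample row standing in for the scraped news_info list
sample = [("../../info/1001/2001.htm", " Research news, part 1 ")]
rows = normalize_news(sample, "https://www.hist.edu.cn/index/sy/kyyw.htm")
save_news_csv(rows, "xinwen.csv")
```

In the original script, `news_info` could be passed to `normalize_news` with `url` as the base before saving. Whether the links on the real page are relative is an assumption; `urljoin` leaves absolute URLs unchanged, so the call is harmless either way.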
