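3. Write a Python code snippet that fetches the homepage content of NetEase News (https://news.163.com) and prints it; use a regular expression to filter out all headline titles.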

Time: 2024-12-10 08:38:27

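First, we use the requests library to fetch the page content, then parse the HTML with BeautifulSoup so it is easier to work with. Because the structure of the NetEase News page may change, we assume here that the titles sit in `<h2>` or `<h3>` tags whose class name is "news_title". Here is a simple Python code snippet:

```python
import re

import requests
from bs4 import BeautifulSoup


def get_news_titles(url):
    # Fetch the page; the timeout keeps the request from hanging indefinitely
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # Raise an exception on HTTP error status codes
    # Use the encoding detected from the body in case the headers omit a charset
    response.encoding = response.apparent_encoding

    # Print the homepage content, as the task requires
    print(response.text)

    # Parse the HTML with the parser from the standard library
    soup = BeautifulSoup(response.text, 'html.parser')

    # Match <h2>/<h3> elements whose class is "news_title" (an assumed class name)
    title_pattern = re.compile(
        r'<(h2|h3)\s+class=["\']news_title["\']\s*>(.*?)</\1>',
        re.IGNORECASE | re.DOTALL,
    )
    titles = title_pattern.findall(str(soup))

    # Each match is a (tag, inner_html) tuple; print the inner part as the headline
    for i, (_tag, title) in enumerate(titles, start=1):
        print(f'Title {i}: {title.strip()}')


if __name__ == '__main__':
    url = 'https://news.163.com/'
    get_news_titles(url)
```

Note: the regular expression in this snippet may need adjusting to the actual page layout, since the page's structure changes over time. Before running the code, make sure the `requests` and `beautifulsoup4` libraries are installed; if not, install them with `pip install requests beautifulsoup4`.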
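Since the live page's markup may not match the assumed "news_title" class, it can help to check the regular expression offline before pointing it at the real site. The following is a minimal sketch under that same assumption: `extract_titles`, `TITLE_PATTERN`, `TAG_PATTERN`, and `SAMPLE_HTML` are hypothetical names introduced for illustration, and the sample markup is hand-written, not copied from news.163.com. It uses only the standard library, tolerates extra attributes or class names on the heading tag, and strips nested tags such as `<a>` from each match.

```python
import html
import re

# Assumed markup: headlines sit in <h2>/<h3> tags whose class list
# contains "news_title". Adjust this once the real page has been inspected.
TITLE_PATTERN = re.compile(
    r'<(h2|h3)\b[^>]*\bclass=["\'][^"\']*\bnews_title\b[^"\']*["\'][^>]*>(.*?)</\1>',
    re.IGNORECASE | re.DOTALL,
)
# Matches any HTML tag, used to remove markup nested inside a headline
TAG_PATTERN = re.compile(r'<[^>]+>')


def extract_titles(page_html):
    """Return headline texts found in page_html, with nested tags removed."""
    titles = []
    for _tag, inner in TITLE_PATTERN.findall(page_html):
        text = TAG_PATTERN.sub('', inner)   # drop nested tags such as <a>
        text = html.unescape(text).strip()  # decode entities such as &amp;
        if text:
            titles.append(text)
    return titles


# Hypothetical sample markup, not taken from the real site
SAMPLE_HTML = """
<div>
  <h2 class="news_title"><a href="/a.html">First headline</a></h2>
  <h3 class='news_title top'>Second &amp; third</h3>
  <h2 class="other">Not a headline</h2>
</div>
"""

if __name__ == '__main__':
    for i, title in enumerate(extract_titles(SAMPLE_HTML), start=1):
        print(f'Title {i}: {title}')
```

To use it on the real page, pass the downloaded HTML text to `extract_titles` in place of `SAMPLE_HTML`. If the result is an empty list, the assumed tag or class name most likely does not match the current page, and the pattern should be updated after inspecting the page source.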
