Use Python to scrape the news text from a news web page (URL: http://www.news.cn/politics/leaders/2023-04/28/c_1129576764.htm).
Time: 2023-10-06 20:06:34
You can do this with Python's requests and BeautifulSoup libraries.
First, use the requests library to fetch the page's HTML content:
```python
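import requests

url = 'http://www.news.cn/politics/leaders/2023-04/28/c_1129576764.htm'
# A timeout keeps the script from hanging if the server does not respond.
response = requests.get(url, timeout=10)
# Stop early on HTTP errors (4xx/5xx) rather than parsing an error page.
response.raise_for_status()
html_content = response.content  # raw bytes; BeautifulSoup detects the encoding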
```
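If the fetch step is reused for several pages, it may help to wrap it in a small function. The following is only a sketch: the helper name `fetch_html`, the 10-second default timeout, and the User-Agent string are all assumptions for illustration, not something the target site is known to require.

```python
import requests


# Hypothetical helper; the name and defaults are illustrative assumptions.
def fetch_html(url, timeout=10):
    """Return the raw bytes of the page at `url`, or raise on failure."""
    # Assumed User-Agent value; some sites reject the library default.
    headers = {"User-Agent": "Mozilla/5.0 (compatible; example-scraper)"}
    response = requests.get(url, headers=headers, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses.
    response.raise_for_status()
    return response.content
```

It would be called as `html_content = fetch_html(url)`, which leaves the parsing step unchanged.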
Then, use the BeautifulSoup library to parse the HTML content and extract the news text:
```python
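from bs4 import BeautifulSoup
soup = BeautifulSoup(html_content, 'html.parser')
# The class name must match the page's actual markup; find() returns None otherwise.
article = soup.find('div', class_='article')
news_content = article.get_text().strip() if article else ''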
```
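The extraction step can be tried offline on a small HTML string before running it against the live page. The sketch below assumes the article body sits in a `div` with class `article` and that its paragraphs are `<p>` tags; both are assumptions about the markup, and `extract_text` is a hypothetical helper name.

```python
from bs4 import BeautifulSoup


# Hypothetical helper; the default class name is an assumption about the page.
def extract_text(html, css_class="article"):
    """Return the article text, or None if the container div is missing."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_=css_class)
    if container is None:
        return None
    # Collect each paragraph's text, dropping paragraphs that are empty.
    paragraphs = [p.get_text(strip=True) for p in container.find_all("p")]
    paragraphs = [text for text in paragraphs if text]
    if paragraphs:
        return "\n".join(paragraphs)
    # No <p> tags inside the container: fall back to its whole text.
    return container.get_text(strip=True)


sample = '<div class="article"><p> First. </p><p></p><p>Second.</p></div>'
print(extract_text(sample))
print(extract_text("<div class='other'>x</div>"))  # prints None
```

Returning `None` for a missing container makes it easy to tell "selector did not match" apart from "article was empty", which is useful when the class name has to be adjusted after inspecting the real page.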
Finally, print the news text:
```python
print(news_content)
```
The complete code is as follows:
```python
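import requests
from bs4 import BeautifulSoup

url = 'http://www.news.cn/politics/leaders/2023-04/28/c_1129576764.htm'
# Fetch the page; fail fast on timeouts and HTTP errors.
response = requests.get(url, timeout=10)
response.raise_for_status()
html_content = response.content

# Parse the HTML and locate the div that holds the article body.
soup = BeautifulSoup(html_content, 'html.parser')
article = soup.find('div', class_='article')
# find() returns None when no matching div exists, so guard before reading text.
news_content = article.get_text().strip() if article else ''
print(news_content)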
```