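How do I scrape web page content with Python?
Posted: 2024-03-28 21:31:41 Views: 23
The steps below show how to scrape web page content with Python:
1. Import the requests and BeautifulSoup libraries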
```python
import requests
from bs4 import BeautifulSoup
```
2. Fetch the page with the requests library
```python
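url = 'https://www.example.com'
response = requests.get(url, timeout=10)  # a timeout keeps the call from hanging indefinitely
response.raise_for_status()  # raise an exception on a 4xx/5xx response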
```
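A bare `requests.get(url)` has no timeout and does not raise on an HTTP error status, so it can help to wrap the fetch in a small helper. The sketch below is one possible way to do that, not the only one. The `fetch_html` name, the `session` parameter, the 10-second default timeout, and the `User-Agent` string are all assumptions made for illustration.

```python
import requests


def fetch_html(url, session=None, timeout=10.0):
    """Return the decoded body of `url`, raising on HTTP errors.

    `session` is optional; passing one in lets callers reuse connections.
    """
    http = session if session is not None else requests.Session()
    response = http.get(
        url,
        timeout=timeout,
        # Hypothetical identifier; replace it with one that describes your client.
        headers={"User-Agent": "example-scraper/0.1"},
    )
    # Turn 4xx/5xx responses into requests.HTTPError instead of parsing an error page.
    response.raise_for_status()
    return response.text
```

A call such as `html = fetch_html('https://www.example.com')` would then give the same text as `response.text` above, with failures surfacing as exceptions rather than as an unexpected page body.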
3. Parse the page with the BeautifulSoup library
```python
soup = BeautifulSoup(response.text, 'html.parser')
```
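`response.text` depends on the encoding that requests infers for the response, mostly from the HTTP headers. When that guess is wrong, non-ASCII text can come out garbled. One alternative is to hand BeautifulSoup the raw bytes and let it detect the encoding, for example from a `<meta charset>` declaration. The sketch below illustrates this on an in-memory document; the `parse_html_bytes` helper and the sample GBK-encoded page are assumptions made for illustration.

```python
from bs4 import BeautifulSoup


def parse_html_bytes(raw_bytes):
    """Parse raw HTML bytes and let BeautifulSoup work out the encoding."""
    return BeautifulSoup(raw_bytes, "html.parser")


# Illustrative input: a GBK-encoded page that declares its charset in a <meta> tag.
SAMPLE_BYTES = (
    '<html><head><meta charset="gbk"><title>示例页面</title></head>'
    "<body><p>你好</p></body></html>"
).encode("gbk")

soup = parse_html_bytes(SAMPLE_BYTES)
print(soup.title.string)
```

With a real response, the equivalent call would be `parse_html_bytes(response.content)`.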
4. Extract the information you need with BeautifulSoup
```python
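# Extract the page title (soup.title is None when the page has no <title> tag)
title = soup.title.string if soup.title else None
# Collect the href of every link
links = []
for link in soup.find_all('a'):
    links.append(link.get('href'))
# Collect the text of every paragraph
paragraphs = []
for paragraph in soup.find_all('p'):
    paragraphs.append(paragraph.text)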
```
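The extraction step can be tried without any network access by running it on an HTML string. The sketch below does that and also handles two cases the loop above leaves open: `<a>` tags without an `href` (for which `link.get('href')` returns `None`) and relative links, which are resolved here with the standard library's `urljoin`. The `extract_page_data` function, the sample HTML, and the base URL are assumptions made for illustration.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_page_data(html, base_url):
    """Return the title, absolute link URLs, and paragraph texts of a page."""
    soup = BeautifulSoup(html, "html.parser")
    # soup.title is None when the document has no <title> tag.
    title = soup.title.get_text().strip() if soup.title else None
    # href=True skips <a> tags that have no href attribute at all.
    links = [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]
    paragraphs = [p.get_text().strip() for p in soup.find_all("p")]
    return {"title": title, "links": links, "paragraphs": paragraphs}


# Illustrative input only; a real page would come from the fetch step.
SAMPLE_HTML = """
<html><head><title>Demo page</title></head>
<body>
<p>First paragraph.</p>
<a href="/about">About</a>
<a name="top">Anchor without href</a>
<p>Second <b>paragraph</b>.</p>
</body></html>
"""

data = extract_page_data(SAMPLE_HTML, "https://www.example.com/index.html")
print(data)
```

Returning a dictionary keeps the three results together, which makes the function easy to reuse across several pages.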
The complete code:
```python
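import requests
from bs4 import BeautifulSoup

url = 'https://www.example.com'
response = requests.get(url, timeout=10)  # a timeout keeps the call from hanging indefinitely
response.raise_for_status()  # raise an exception on a 4xx/5xx response
soup = BeautifulSoup(response.text, 'html.parser')
# Extract the page title (soup.title is None when the page has no <title> tag)
title = soup.title.string if soup.title else None
# Collect the href of every link
links = []
for link in soup.find_all('a'):
    links.append(link.get('href'))
# Collect the text of every paragraph
paragraphs = []
for paragraph in soup.find_all('p'):
    paragraphs.append(paragraph.text)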
```