
How to crawl a jump (redirect) page with Python

Date: 2023-09-29 22:08:27 Views: 118


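To crawl the page that a link jumps to, you can use the requests and BeautifulSoup libraries in Python. First fetch the HTML of the original page with requests, then parse it with BeautifulSoup. If the page contains a jump link, call `requests.get()` again on that link to fetch the HTML of the target page. For example:

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# Fetch the HTML of the original page
url = 'https://www.example.com'
r = requests.get(url, timeout=10)
r.raise_for_status()

# Parse the HTML
soup = BeautifulSoup(r.content, 'html.parser')

# Find the jump link: the first <a> tag that has an href attribute
a_tag = soup.find('a', href=True)
if a_tag is None:
    raise SystemExit('No link found on the page')

# The href may be relative, so resolve it against the page URL
link = urljoin(r.url, a_tag['href'])

# Fetch the HTML of the target page
r = requests.get(link, timeout=10)
r.raise_for_status()

# Parse the HTML of the target page
soup = BeautifulSoup(r.content, 'html.parser')
```

In this example, we first fetch the HTML of the original page and parse it with BeautifulSoup. We then take the jump link from the first `<a>` tag that has an `href`, resolve it to an absolute URL with `urljoin()`, and fetch the target page with `requests.get()`. Finally, we parse the target page's HTML with BeautifulSoup again. Note that `soup.find('a', href=True)` returns only the first link on the page; in practice, narrow the search (for example by `id` or `class`) so that it matches the actual jump link.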
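A "jump" does not always come from an `<a>` link. For HTTP 3xx redirects, `requests.get()` follows them by default, and the visited hops are available in `r.history`, with the final address in `r.url`. Some pages instead jump through a `<meta http-equiv="refresh">` tag, which requests does not follow. The following is a minimal sketch of how the jump target could be picked out of a page in that case. It is an illustration under assumptions, not a definitive implementation: the names `JumpTargetParser` and `find_jump_url` are hypothetical, and it uses the standard library's `html.parser` rather than BeautifulSoup so that it runs without extra dependencies.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class JumpTargetParser(HTMLParser):
    """Collect the meta-refresh target and the first <a href> in a page."""

    def __init__(self):
        super().__init__()
        self.meta_refresh = None  # URL from <meta http-equiv="refresh">
        self.first_link = None    # href of the first <a> tag

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            # The content attribute usually looks like "5; url=/next"
            content = attrs.get("content") or ""
            _, _, rest = content.partition(";")
            key, _, value = rest.strip().partition("=")
            if key.strip().lower() == "url" and value and self.meta_refresh is None:
                self.meta_refresh = value.strip().strip("'\"")
        elif tag == "a" and self.first_link is None and attrs.get("href"):
            self.first_link = attrs["href"]


def find_jump_url(html, base_url):
    """Return the absolute jump URL, or None if the page has no candidate.

    Assumption: a meta refresh takes priority over the first link.
    """
    parser = JumpTargetParser()
    parser.feed(html)
    target = parser.meta_refresh or parser.first_link
    return urljoin(base_url, target) if target else None
```

With requests, this could be used as `find_jump_url(r.text, r.url)`, passing the result to a second `requests.get()` call when it is not `None`.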
